10/018929
Organic Compounds

The present invention relates to DNA which encodes proteins that control gene silencing, 
and particularly the silencing of plant genes. 

The loss of expression of previously active genes in plants, also referred to as gene 
silencing, is observed in response to developmental, environmental or unknown signals. It 
occurs at a frequency higher than that of mutations, yet it is markedly stable during somatic 
transmission. Gene silencing, initially perceived as an unwanted source of instability of 
transgene expression, is now regarded as a molecular tool to intentionally regulate gene 
expression. 

It appears that chromosomal position or structure of the affected loci are factors determining 
the frequency and strength of silencing. Inactivation seems to preferentially affect genes 
present in multiple copies and is thought to be a consequence of sequence redundancy. 
Many examples of homology-dependent gene silencing have been reported. Closer 
analysis has allowed the classification of silencing events according to the relative position 
of the affected loci (cis, trans, allelic, ectopic), the origin of the affected genes (endogenous
or transgenic), and the level of interaction (transcriptional or post-transcriptional). While 
post-transcriptional silencing seems to mainly involve the formation of aberrant RNA 
molecules and is occasionally, but not necessarily, accompanied by DNA methylation, 
silencing interfering with transcription initiation is more strictly correlated with 
hypermethylation of the DNA and possibly with alteration of chromatin structure at the silent 
loci. It is, however, not clear whether these molecular events are a prerequisite for gene 
silencing or a consequence of the silent state. 

In the case of transcriptional silencing, the inactive state of silenced genes is stably 
transmitted through mitotic and meiotic divisions. As in other organisms, trans-acting 
modifier loci are assumed to be responsible for the stability of the inactive state of the 
silenced genes. Mutations in such loci that give rise to mutated proteins are expected to result in
reduced gene silencing and reactivation of previously silent loci, either by interfering with the
maintenance of the silent state or by a failure to recognize sequence redundancy. It has
been reported that mutations in the DDM1 gene of Arabidopsis thaliana release
transcriptional gene silencing and that this gene encodes a SWI2/SNF2-like protein
involved in chromatin remodeling. However, mutation of the DDM1 gene causes severe
pleiotropic effects. Therefore, to be able to modify such effects making use of gene
technology, it is necessary to identify further specific modifier loci and characterize the
corresponding wild-type and mutant proteins. It is the main objective of the present
invention to provide DNA comprising an open reading frame encoding such a protein.

Trans-acting modifier loci according to the present invention can be identified by T-DNA 
insertion mutagenesis as described in Example 1 for an Arabidopsis line carrying a heritably 
inactivated, methylated hygromycin resistance gene. A mutation of a silencing modifier 
locus results in release of silencing of the hygromycin resistance gene and restores 
hygromycin resistance. Plants homozygous for the silent resistance gene are subjected to 
transformation with a selectable marker gene different from the hygromycin resistance 
gene, which is under the control of the T-DNA 1'-2' dual promoter. Transformants are 
selected and their progeny screened for hygromycin resistance. The mutant phenotype 
(hygromycin resistance) is screened for genetic co-segregation with a specific T-DNA insert. 
Cloning of the tagged gene using routine methods of recombinant DNA technology allows
characterization of the mutant and wild-type DNA sequences of the silencing modifier locus as
well as of the encoded protein.

Within the context of the present invention reference to a gene is to be understood as 
reference to a DNA coding sequence associated with regulatory sequences, which allow 
transcription of the coding sequence into RNA such as mRNA, rRNA, tRNA, snRNA, sense 
RNA or antisense RNA. Examples of regulatory sequences are promoter sequences, 5' and 
3' untranslated sequences, introns, and termination sequences. 

A promoter is understood to be a DNA sequence initiating transcription of an associated 
DNA sequence, and may also include elements that act as regulators of gene expression 
such as activators, enhancers, or repressors. 

Expression of a gene refers to its transcription into RNA or its transcription and subsequent 
translation into protein within a living cell. In the case of antisense constructs expression 
refers to the transcription of the antisense DNA only. 

The term transformation of cells designates the introduction of nucleic acid into a host cell,
particularly the stable integration of a DNA molecule into the genome of said cell.

Any part or piece of a specific nucleotide or amino acid sequence is referred to as a
component sequence.



DNA according to the present invention comprises an open reading frame encoding a 
protein characterized by an amino acid sequence comprising a component sequence of at 
least 150 amino acid residues having 40% or more identity with SEQ ID NO: 3. In particular 
the protein encoded by the open reading frame can be described by the formula R1-R2-R3, 
wherein 

- R1, R2 and R3 constitute component sequences consisting of amino acid residues
independently selected from the group of the amino acid residues Gly, Ala, Val, Leu, Ile,
Phe, Pro, Ser, Thr, Cys, Met, Trp, Tyr, Asn, Gln, Asp, Glu, Lys, Arg, and His;

- R1 and R3 consist independently of 0 to 3000 amino acid residues;

- R2 consists of at least 150 amino acid residues; and

- R2 is at least 40% identical to an aligned component sequence of SEQ ID NO: 3.
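
As a rough illustration of the R2 criterion, the following Python sketch computes percent identity for a pair of sequences that have already been aligned (gaps written as "-") and tests the two thresholds. The function names, the choice of dividing by the number of alignment columns, and any sequences passed in are assumptions made for this sketch; BLAST reports identity relative to its own alignment length and may give slightly different figures.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    # A column counts as identical only if it holds the same residue (never a gap).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def satisfies_r2(aligned_candidate: str, aligned_reference: str,
                 min_residues: int = 150, min_identity: float = 40.0) -> bool:
    """True if the candidate stretch is long enough and identical enough.

    min_residues and min_identity default to the thresholds named in the text
    (at least 150 residues, at least 40% identity).
    """
    residues = sum(1 for c in aligned_candidate if c != "-")
    identity = percent_identity(aligned_candidate, aligned_reference)
    return residues >= min_residues and identity >= min_identity
```

The preferred thresholds mentioned elsewhere in the description (200 residues, 50% or 55% identity) can be tested by passing other values for `min_residues` and `min_identity`.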

In most cases the total length of the protein will be in the range of 1000 to 3000 amino acid
residues. In preferred embodiments of the invention the component sequence R2 consists
of at least 200 amino acid residues. Specific examples of the component sequence R2 are
component sequences of SEQ ID NO: 3 represented by the following ranges of amino acids:

1 - 416 (corresponding to exon 2);
418 - 583 (corresponding to exons 3 to 5);
584 - 890 (corresponding to exon 6);
892 - 1472 (corresponding to exons 7 to 9);
1007 - 1472 (corresponding to exon 9);
1473 - 1631 (corresponding to exons 10 to 12);
1632 - 1827 (corresponding to exons 13 to 15); and
1829 - 2001 (corresponding to exon 16).


In a preferred embodiment of the present invention at least one of the component
sequences R1 or R3 comprises one or more additional component sequences with a length
of at least 50 amino acids and at least 60% identical to an aligned component sequence of
SEQ ID NO: 3. Specific examples of such additional component sequences are component
sequences of SEQ ID NO: 3 represented by the following ranges of amino acids:

420 - 525 (corresponding to exons 3 and 4);
444 - 525 (corresponding to exon 4);
526 - 583 (corresponding to exon 5);
892 - 971 (corresponding to exon 7);
892 - 1006 (corresponding to exons 7 and 8);
1473 - 1524 (corresponding to exon 10);
1525 - 1576 (corresponding to exon 11);
1577 - 1631 (corresponding to exon 12);
1632 - 1690 (corresponding to exon 13);
1692 - 1757 (corresponding to exon 14); and
1758 - 1827 (corresponding to exon 15).
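
The amino acid ranges listed above for R2 and for the additional component sequences can be checked against the stated minimum lengths (150 and 50 residues, respectively) with a few lines of Python. This is only a consistency check on the numbers given in the text; the variable and function names are chosen here for illustration.

```python
# Ranges of SEQ ID NO: 3 given as examples of the component sequence R2.
R2_RANGES = [(1, 416), (418, 583), (584, 890), (892, 1472),
             (1007, 1472), (1473, 1631), (1632, 1827), (1829, 2001)]

# Ranges of SEQ ID NO: 3 given as examples of additional component sequences.
ADDITIONAL_RANGES = [(420, 525), (444, 525), (526, 583), (892, 971),
                     (892, 1006), (1473, 1524), (1525, 1576), (1577, 1631),
                     (1632, 1690), (1692, 1757), (1758, 1827)]


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range start..end."""
    return end - start + 1


def all_at_least(ranges, minimum: int) -> bool:
    """True if every inclusive range contains at least `minimum` residues."""
    return all(span_length(start, end) >= minimum for start, end in ranges)


if __name__ == "__main__":
    print("R2 examples >= 150 residues:", all_at_least(R2_RANGES, 150))
    print("additional examples >= 50 residues:", all_at_least(ADDITIONAL_RANGES, 50))
```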



Dynamic programming algorithms yield different kinds of alignments. In general there exist
two approaches towards sequence alignment. Algorithms as proposed by Needleman &
Wunsch and by Sellers align the entire length of two sequences, providing a global
alignment of the sequences. The Smith-Waterman algorithm, on the other hand, yields local
alignments. A local alignment aligns the pair of regions within the sequences that are most
similar given the choice of scoring matrix and gap penalties. This allows a database search
to focus on the most highly conserved regions of the sequences. It also allows similar
domains within sequences to be identified. To speed up alignments using the
Smith-Waterman algorithm, both BLAST (Basic Local Alignment Search Tool) and FASTA place
additional restrictions on the alignments.
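
To make the notion of a local alignment concrete, the following Python sketch implements the basic Smith-Waterman recurrence with a traceback. The scoring scheme (match +2, mismatch -1, linear gap penalty -2) is an assumption chosen to keep the example short; protein searches normally use a substitution matrix and affine gap penalties, and BLAST itself uses a faster heuristic rather than this exhaustive procedure.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores never drop below zero, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

With two otherwise unrelated toy sequences that share a short stretch, only that stretch is returned, which is the behaviour that lets a search focus on conserved regions.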

Within the context of the present invention alignments are conveniently performed using 
BLAST, a set of similarity search programs designed to explore all of the available 
sequence databases regardless of whether the query is protein or DNA. Version BLAST 2.0 
(Gapped BLAST) of this search tool has been made publicly available on the internet 
(currently http://www.ncbi.nlm.nih.gov/BLAST/). It uses a heuristic algorithm which seeks 
local as opposed to global alignments and is therefore able to detect relationships among 
sequences which share only isolated regions. The scores assigned in a BLAST search have 
a well-defined statistical interpretation. Particularly useful within the scope of the present 
invention are the blastp program allowing for the introduction of gaps in the local sequence 
alignments and the PSI-BLAST program, both programs comparing an amino acid query 
sequence against a protein sequence database, as well as a blastp variant program
allowing local alignment of two sequences only. Said programs are preferably run with
optional parameters set to the default values.



Sequence alignments using BLAST can also take into account whether the substitution of 
one amino acid for another is likely to conserve the physical and chemical properties 
necessary to maintain the structure and function of the protein or is more likely to disrupt 
essential structural and functional features of a protein. Such sequence similarity is
quantified in terms of a percentage of "positive" amino acids, as compared to the
percentage of identical amino acids, and can help in assigning a protein to the correct protein
family in borderline cases.

Sequence alignments using such computer programs reveal the presence of an ATP/GTP-binding
motif A (amino acids 460 to 467 in SEQ ID NO: 3), the consensus sequence of
which is (Ala/Gly)XaaXaaXaaXaaGlyLys(Ser/Thr), wherein (Ala/Gly) indicates Ala or Gly,
Xaa indicates any naturally occurring amino acid and (Ser/Thr) indicates Ser or Thr.
Alignment additionally reveals a region (amino acid positions 479 to 719 in SEQ ID NO: 3)
similar to part of the ATPase/helicase domain of proteins in the SWI2/SNF2 family, which
are involved in chromatin remodeling, but no significant overall sequence identity with known
proteins.
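
The motif A consensus given above translates directly into a regular expression over one-letter amino acid codes. The sketch below is a minimal illustration; the function name is hypothetical, and any sequence passed to it in testing is a toy input rather than a stretch of SEQ ID NO: 3.

```python
import re

# One-letter codes of the twenty standard amino acids (stands in for "Xaa").
_ANY = "ACDEFGHIKLMNPQRSTVWY"

# (Ala/Gly) Xaa Xaa Xaa Xaa Gly Lys (Ser/Thr), wrapped in a lookahead so that
# overlapping occurrences are reported as well.
MOTIF_A = re.compile(rf"(?=([AG][{_ANY}]{{4}}GK[ST]))")


def find_motif_a(protein: str):
    """Return (start, end, match) tuples with 1-based inclusive positions."""
    return [(m.start() + 1, m.start() + 8, m.group(1))
            for m in MOTIF_A.finditer(protein.upper())]
```

Positions are reported 1-based and inclusive so that they can be compared directly with ranges such as "amino acids 460 to 467".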



Specific examples of DNA according to the present invention are described in SEQ ID
NO: 1 and SEQ ID NO: 2 encoding an Arabidopsis protein described in SEQ ID NO: 3.
Stretches of SEQ ID NO: 3 with a length of 50 to 500 amino acids can show between 20 and
50% sequence identity to stretches of known protein sequences after alignment. Overall
alignments of SEQ ID NO: 3, however, result in sequence identities lower than 30%. Thus,
the present invention defines a new protein family the members of which are characterized 
by an amino acid sequence comprising a component sequence of at least 150 amino acid 
residues having 40% or more identity with an aligned component sequence of SEQ ID 
NO: 3. Preferably the amino acid sequence identity is higher than 50% or even higher than 
55%. 



DNA encoding proteins belonging to the new protein family according to the present
invention can be isolated from monocotyledonous and dicotyledonous plants. Preferred
sources are corn, sugarbeet, sunflower, winter oilseed rape, soybean, cotton, wheat, rice,
potato, broccoli, cauliflower, cabbage, cucumber, sweet corn, daikon, garden beans,
lettuce, melon, pepper, squash, tomato, or watermelon. However, such DNA can also be isolated
from mammalian sources such as mouse or human tissues. The following general method
can be used, which the person skilled in the art knows how to adapt to the specific task. A
single-stranded fragment of SEQ ID NO: 1 or SEQ ID NO: 2 consisting of at least 15, preferably 20
to 30 or even more than 100 consecutive nucleotides is used as a probe to screen a DNA
library for clones hybridizing to said fragment. The factors to be observed for hybridization
are described in Sambrook et al, Molecular cloning: A laboratory manual, Cold Spring
Harbor Laboratory Press, chapters 9.47-9.57 and 11.45-11.49, 1989. Hybridizing clones are
sequenced, and DNA of clones comprising a complete coding region encoding a protein
characterized by an amino acid sequence comprising a component sequence of at least
150 amino acid residues having 40% or more sequence identity to SEQ ID NO: 3 is purified.
Said DNA can then be further processed by a number of routine recombinant DNA
techniques such as restriction enzyme digestion, ligation, or polymerase chain reaction
analysis.

The disclosure of SEQ ID NO: 1 and SEQ ID NO: 2 enables a person skilled in the art to
design oligonucleotides for polymerase chain reactions which attempt to amplify DNA
fragments from templates comprising a sequence of nucleotides characterized by any
continuous sequence of 15 and preferably 20 to 30 or more basepairs in SEQ ID NO: 1 or
SEQ ID NO: 2. Said oligonucleotides comprise a sequence of nucleotides which represents 15
and preferably 20 to 30 or more basepairs of SEQ ID NO: 1 or SEQ ID NO: 2. Polymerase
chain reactions performed using at least one such oligonucleotide and their amplification
products constitute another embodiment of the present invention.
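
Whether a given oligonucleotide "represents 15 or more basepairs" of a reference sequence can be tested mechanically. The following Python sketch looks for a shared stretch of at least 15 consecutive nucleotides on either strand of the reference. The function names are hypothetical, reading "basepairs" as covering both strands is an assumption of this sketch, and the reference is whatever sequence the caller supplies, for example SEQ ID NO: 1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_stretch(oligo: str, reference: str, min_len: int = 15) -> bool:
    """True if oligo and reference share >= min_len consecutive nucleotides.

    Both strands of the reference are considered, since a primer may anneal
    to either strand.
    """
    oligo, reference = oligo.upper(), reference.upper()
    strands = (reference, reverse_complement(reference))
    # Slide a window of min_len over the oligo and look it up on each strand.
    for start in range(len(oligo) - min_len + 1):
        window = oligo[start:start + min_len]
        if any(window in strand for strand in strands):
            return True
    return False
```

Oligonucleotides shorter than `min_len` are rejected automatically because no window of the required length exists.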

EXAMPLES:

Example 1: T-DNA Insertion

Transgenic line A of Arabidopsis thaliana ecotype Zurich with a transcriptionally silenced
locus containing multiple copies of a chimeric hygromycin phosphotransferase gene (hpt)
has been described in Mittelsten Scheid et al, Mol Gen Genet 228: 104-112, 1991 and
Mittelsten Scheid et al, Proc Natl Acad Sci USA 93: 7114-7119, 1996. A homozygous,
diploid genotype of said line is subjected to Agrobacterium-mediated gene transfer by in
planta vacuum infiltration (Bechtold et al, C R Acad Sci Paris Life Science 316: 1194-1199,
1993), generating more than 4000 independent T-DNA transformants. The binary vector with
T-DNA consisting of the coding region of the bar gene transcriptionally fused to the 1'
promoter (p1'barbi), the Agrobacterium strain (C58C1RifR) and the transformation protocol
are described by Mengiste et al, Plant J 12: 945-948, 1997. Transformants (T1 plants) are
selected by repeated spraying of germinated seedlings with Basta solution (150 mg/l) and
grown to maturity.

Example 2: Mutant Selection 

Selfed seeds (T2 families) are collected from individual transformants. Prior to screening for 
revertants of the silenced phenotype, seeds are dried for one week at room temperature 
and cold-treated at 4°C for a minimum of one week. Pooled aliquots of approximately 1000 
seeds (consisting of 50 seeds from 20 T2 families) are surface-sterilized twice (with 5% 
sodium hypochlorite containing 0.1% Tween 80) for 7 min and washed with sterile
double-distilled water. For selection, each aliquot is plated on 14-cm Petri dishes containing 75 ml
germination medium (according to Masson et al, Plant J 2: 829-933, 1992) solidified with
0.8% agar and containing 10 mg/l hygromycin B (Calbiochem). To ensure equal distribution
during sowing, seeds are mixed with 30 ml of the same medium containing 0.4% agar. As a
positive control, two seeds from a hygromycin-resistant line are sown at marked locations on
each plate. Plates are cold-treated at 4°C for 2 days and subsequently subjected to 
alternating periods of 16 hours light at 21 °C and 8 hours darkness at 16°C. Hygromycin 
resistance is evaluated each day for 8-15 days after sowing. 

Example 3: Molecular and Genetic Analysis of the Mutant 

Following identification of 11 hygromycin-resistant seedlings in one of the pools, the families
forming this pool are re-screened individually. One family contains approximately 25%
hygromycin-resistant seedlings. Six resistant plantlets of this family are transferred to larger
vessels containing germination medium without hygromycin. After rosette formation and
development of the root system, plants are transferred to soil for further growth and seed
setting. Prior to potting, tissue explants are taken from each plant to generate callus
cultures on RCA medium (Table 1) with or without 10 mg/l hygromycin B. Callus cultures are
used as a source of material for DNA and RNA analyses and for a further confirmation of
hygromycin resistance in this tissue.

Genomic DNA is isolated using a CTAB-based method as described by Mittelsten Scheid et
al, Mol Gen Genet 244: 325-330, 1994, and incubated with restriction enzymes BamHI,
HpaII, MspI, DraI, EcoRV, RsaI or HindIII. Total RNA is obtained using a RNAeasy kit
(Qiagen) according to the supplier's recommendation. Southern and northern blot analyses
are performed under conditions described by Church and Gilbert, Proc Natl Acad Sci USA
81: 1991-1995, 1984, using DNA fragments labelled with 32P by random prime labeling. The
coding region of the hpt gene, or DNA consisting of the P35S promoter, hpt coding and
terminator region, or the coding region of the bar gene together with the 1' promoter are
used as probes.

Northern blot analysis of 4 hygromycin-resistant siblings shows restoration of transcription
of the hpt gene. Southern blot analysis of said siblings indicates that there is no detectable
rearrangement within the complex hpt insert. The hpt transgene complex in the mutant is
still hypermethylated, as in the original line A, as judged by Southern blot analysis with the
methylation-sensitive restriction enzymes HpaII and MspI, and by genomic sequencing of
the promoter region after treatment with bisulfite. There is also no influence of the mutation
on the methylation of repetitive genomic DNA, in contrast to that observed for the som
mutations.

The hygromycin-resistant plants, as well as non-selected siblings from the same family are 
grown to set seeds, checked for Basta resistance in the next generation, and scored for the 
number and size of the T-DNA inserts by Southern analysis. The results demonstrate that 
the original T-DNA transformant must have contained 2 T-DNA insertions segregating 
independently in the siblings. One insert co-segregates with the hygromycin resistant 
mutant phenotype. A plant homozygous for this insert and lacking the other T-DNA insert, is 
used for cloning the corresponding T-DNA insertion site. 

Table 1: Composition of RCA medium

RCA medium
  MS macro 10 x                      100 ml
  B5 micro 1000 x                    1 ml
  ferric citrate                     5 ml
  NT vitamins 100 x                  10 ml
  sucrose                            10 g
  MES                                5 ml
  agar                               10 g
  NAA                                0.1 mg
  BAP                                1 mg
  pH 5.8 (KOH)
  ad 1 l

MS macro 10 x
  potassium nitrate                  19 g
  ammonium nitrate                   16.5 g
  calcium chloride (x 2 H2O)         4.4 g
  magnesium sulfate (x 7 H2O)        3.7 g
  potassium dihydrogen phosphate     1.7 g
  ad 1 l

B5 micro 1000 x
  magnesium sulfate (x H2O)          1000 mg
  boric acid                         300 mg
  zinc sulfate (x 7 H2O)             200 mg
  potassium iodide                   75 mg
  sodium molybdate (x 2 H2O)         25 mg
  copper sulfate (x 5 H2O)           2.5 mg
  cobalt chloride (x 6 H2O)          2.5 mg
  ad 100 ml

ferric citrate
  ammonium iron citrate              10 g
  ad 1 l

NT vitamins 100 x
  myo-inositol                       1000 mg
  thiamine HCl                       10 mg
  ad 1 l

MES
  MES                                14 g
  pH 6 (NaOH)
  ad 100 ml
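
The final concentration of each stock component in RCA medium follows from the stock recipe and the volume of stock added per litre. The Python sketch below shows the arithmetic for a few entries of Table 1. It assumes that "ad 1 l" and "ad 100 ml" mean the stock is made up to that final volume; the function name is hypothetical.

```python
def final_per_litre(amount_in_stock: float, stock_volume_ml: float,
                    ml_added_per_litre: float) -> float:
    """Amount of a component per litre of medium, in the unit of amount_in_stock.

    amount_in_stock     quantity dissolved in the stock (g or mg)
    stock_volume_ml     final volume of the stock solution in ml
    ml_added_per_litre  volume of stock added to 1 l of RCA medium
    """
    return amount_in_stock * ml_added_per_litre / stock_volume_ml


if __name__ == "__main__":
    # Potassium nitrate: 19 g per litre of MS macro 10 x, 100 ml of stock per litre of medium.
    print("potassium nitrate, g/l:", final_per_litre(19.0, 1000.0, 100.0))
    # Boric acid: 300 mg per 100 ml of B5 micro 1000 x, 1 ml of stock per litre of medium.
    print("boric acid, mg/l:", final_per_litre(300.0, 100.0, 1.0))
    # MES: 14 g per 100 ml of stock, 5 ml of stock per litre of medium.
    print("MES, g/l:", final_per_litre(14.0, 100.0, 5.0))
```

The "10 x" and "1000 x" labels are consistent with this calculation: 100 ml of a 10 x stock or 1 ml of a 1000 x stock per litre both give a 1 x final concentration.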

Example 4: Cloning of the "Silencing Gene"

Genomic DNA from the plant containing only the T-DNA co-segregating with the hygromycin
resistant mutant phenotype is isolated. The DNA is subjected to TAIL (thermal asymmetric
interlaced) PCR according to Liu et al, Plant J 8: 457-463, 1995, using 3 specific, nested
primers close to the right border of the T-DNA (5'-CAT CTA CGG CAA TGT ACC AGC-3'
(SEQ ID NO: 4), 5'-GAT GGG AAT TGG CTG AGT GGC-3' (SEQ ID NO: 5), 5'-CAG
TTC CAA ACG TAA AAC GGC-3' (SEQ ID NO: 6)), which are directed outwards, and one
of several degenerate primers which might bind in flanking plant DNA. Two out of the
following seven degenerate primers

AD1 5'-NTC GAS TWT SGW GTT-3' (Liu et al supra; SEQ ID NO: 7)
AD2 5'-NGT CGA SWG ANA WGA A-3' (Liu et al supra; SEQ ID NO: 8)
AD3 5'-WGT GNA GWA NCA NAG A-3' (Liu et al supra; SEQ ID NO: 9)
AD4 5'-WGG WAN CWG AWA NGC A-3' (SEQ ID NO: 10)
AD5 5'-WCG WWG AWC ANG NCG A-3' (SEQ ID NO: 11)
AD6 5'-WGC NAG TNA GWA NAA G-3' (SEQ ID NO: 12)
AD7 5'-AWG CAN GNC WGA NAT A-3' (SEQ ID NO: 13)

actually result in amplification of specific fragments. The larger one, obtained using AD7, is
cloned and sequenced. It contains 50 bp of the T-DNA and 275 bp of flanking plant DNA.
Southern blot analysis shows that this PCR fragment contains the plant DNA flanking
the T-DNA. The PCR fragment is used to screen a genomic library (Stratagene) of wild type
Arabidopsis thaliana ecotype Columbia. Three genomic clones hybridizing to the PCR
fragment are identified. The genomic clones are further mapped with restriction enzymes,
hybridized to the PCR fragment and aligned to each other. In one of the genomic clones
obtained (p4A-11), the sequence found to flank the T-DNA of the insertion mutation is
located approximately in the middle of the genomic sequence. An approximately 800 bp
EcoRI-Sal I fragment of p4A-11 is used to obtain the overlapping genomic clone p5-6, and
an approximately 700 bp EcoRI fragment of p5-6 is used to obtain genomic clone p30-1
overlapping with p5-6. An approximately 700 bp HindIII fragment of p30-1 is used to obtain
the genomic clone p33-19 overlapping with p30-1. Said clones are sequenced to design
primers for RT-PCR. The approximately 700 bp EcoRI fragment of p5-6 is further used for
screening a cDNA library (Elledge et al, Proc Natl Acad Sci USA 88: 1731-1735, 1991).
Nine cDNA clones are obtained and the longest clone, p17-8, having a length
of 2.6 kb, is sequenced.
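
The degenerate AD primers of this example are written with IUPAC ambiguity codes, where N stands for any of the four bases, S for C or G, and W for A or T. The Python sketch below counts how many distinct sequences a degenerate primer represents and, for short inputs, enumerates them. Only the codes that occur in the AD primers are included in the lookup table, and the function names are hypothetical.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes: only those used in primers AD1 to AD7.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "CG", "W": "AT", "N": "ACGT"}


def _bases(primer: str):
    """Primer as a list of upper-case codes, with spacing removed."""
    return [b for b in primer.upper() if not b.isspace()]


def degeneracy(primer: str) -> int:
    """Number of distinct unambiguous sequences the primer represents."""
    count = 1
    for base in _bases(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """All unambiguous sequences represented by a (short) degenerate primer."""
    pools = [IUPAC[base] for base in _bases(primer)]
    return ["".join(combo) for combo in product(*pools)]
```

Applied to the sequences as listed, `degeneracy` gives 64 for AD1, 128 for AD2 and 256 for AD3. `expand` should be reserved for short inputs, since its output grows with the degeneracy.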

Example 5: Sequence Analysis and Alignments 

Taking into account the large size of the Arabidopsis silencing gene cloned above it cannot 
be entirely excluded that the authentic nucleotide and amino acid sequences of the gene 
and protein, respectively, might deviate from the sequences given in SEQ ID NO: 1, SEQ ID 
NO: 2, and SEQ ID NO: 3 at a few positions due to mutations arising from the cloning 
procedure or due to ambiguities in the sequencing reactions. 

The 2.6 kb cDNA clone is analyzed sequentially from both ends and is shown to contain 
one large ORF as well as a 3' untranslated sequence. 

Analysis of the genomic clones reveals that clones p4A-11 and p5-6 contain sequences
homologous to the cDNA sequence as well as 7 intron sequences. Comparing the genomic
sequences with the DNA sequences flanking the T-DNA insert, it turns out that the T-DNA
insertion causes a deletion of about 2 kb of genomic DNA. The 5' end of the deletion is located in
an intron (intron 12) and the 3' end of the deletion is located downstream of the 3' end of the
cDNA. The sequence of the 5' end of the cDNA clone terminates in the middle of the sequence of the
genomic clone p5-6. Three independent nested RT-PCR reactions are performed to obtain
additional cDNA sequences further upstream. The sequences of the primers used for these
RT-PCRs are as follows:


RT1-1    5'-CTGTACATACTGAGTACAATCGGA-3'    (SEQ ID NO: 14)
RT1-2    5'-GCTTCAATTCCTGCCTCAGTTGAAC-3'   (SEQ ID NO: 15)
RT1-3    5'-CTCTACGTGCTTAACATCATGCGA-3'    (SEQ ID NO: 16)
RT1-4    5'-CCAGCTTCTGCTACTAGAAAGTCAG-3'   (SEQ ID NO: 17)
RT2/3-1  5'-CTGGAGTTGCATGAAATCCTGGATG-3'   (SEQ ID NO: 18)
RT2/3-2  5'-GCTCTTTGTAAGCTGTTCACGAGAC-3'   (SEQ ID NO: 19)
RT2-3    5'-TCGCATGATGTTAAGCACGTAGAG-3'    (SEQ ID NO: 20)
RT2-4    5'-GAGTACTGGTCCGTGAACAGGTAAT-3'   (SEQ ID NO: 21)
RT3-3    5'-ATGCTTGCACAAGCATGGTCGGAAA-3'   (SEQ ID NO: 22)
RT3-4    5'-TGCAACATCGTGCATTTGCTCCAGA-3'   (SEQ ID NO: 23)
RT4-1    5'-CACAAGGATGAGTTTTTCCTTCCGG-3'   (SEQ ID NO: 24)
RT4-2    5'-CTGACTTTCTAGTAGCAGAAGCTGG-3'   (SEQ ID NO: 25)
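
Some of the RT-PCR primers listed in this example are exact reverse complements of one another, which can be confirmed with the short Python check below. The primer sequences are copied from the list; the function names are hypothetical. That complementary primers mark junctions between overlapping RT-PCR fragments is an inference made here, not something the text states.

```python
PRIMERS = {
    "RT1-1": "CTGTACATACTGAGTACAATCGGA",
    "RT1-2": "GCTTCAATTCCTGCCTCAGTTGAAC",
    "RT1-3": "CTCTACGTGCTTAACATCATGCGA",
    "RT1-4": "CCAGCTTCTGCTACTAGAAAGTCAG",
    "RT2/3-1": "CTGGAGTTGCATGAAATCCTGGATG",
    "RT2/3-2": "GCTCTTTGTAAGCTGTTCACGAGAC",
    "RT2-3": "TCGCATGATGTTAAGCACGTAGAG",
    "RT2-4": "GAGTACTGGTCCGTGAACAGGTAAT",
    "RT3-3": "ATGCTTGCACAAGCATGGTCGGAAA",
    "RT3-4": "TGCAACATCGTGCATTTGCTCCAGA",
    "RT4-1": "CACAAGGATGAGTTTTTCCTTCCGG",
    "RT4-2": "CTGACTTTCTAGTAGCAGAAGCTGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def complementary_pairs(primers: dict):
    """Pairs of primer names whose sequences are reverse complements of each other."""
    names = sorted(primers)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if primers[a] == reverse_complement(primers[b])]


if __name__ == "__main__":
    for first, second in complementary_pairs(PRIMERS):
        print(first, "is the reverse complement of", second)
```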

Sequences of several parts of the genomic clones are found to be deposited in the
Arabidopsis database (accession numbers B67281, B62563, B20434, B20425, B21274, 
B08967, B11993, B20116, B12496 and B10852 as end sequences of BAC, and Z18494 
and AA597930 as partial cDNA sequences, on 13 Apr 1999). A comparison of the encoded 
protein sequence with the Swiss Protein Database reveals partial similarity with 
ATPase/helicase proteins of the SWI2/SNF2 family (amino acid position 479 to 719 in SEQ 
ID NO: 3). The encoded protein consists of 2001 amino acids and is calculated to have a 
molecular weight of 219 kD and a pI of 5.1. An ATP/GTP-binding motif (amino acid position
460 to 467 in SEQ ID NO: 3) and three nuclear localization motifs (amino acid positions 362 
to 367, 832 to 838 and 858 to 862 in SEQ ID NO: 3) are found in the encoded protein. 
Similarity to the actin binding domain of chicken tensin (amino acid position 1899 to 1941 in 
SEQ ID NO: 3) and a predicted membrane spanning domain (amino acid position 995 to 
1015 in SEQ ID NO: 3) are also detected. 

Example 6: Homologous genes in other species 

The cDNA clone is used to probe genomic DNA from turnip, tomato, tobacco, maize, 
mouse, fruit fly and man for the presence of homologous genes by Southern blot analysis. 
Hybridization under conditions of low stringency is found in all cases. Cross-hybridizing 
clones from libraries can be identified and sequenced. 

Example 7: Manipulating marker gene expression by antisense constructs 

The 2.6 kb cDNA fragment and a 1.8 kb RT-PCR fragment, amplified by a nested RT-PCR using
primers RT1-1 and RT1-2 for the first PCR and primers RT1-3 and RT1-4 for the second PCR, are
each inversely cloned into the multiple cloning site of the binary vector pbarbi53 to generate
antisense RNA. pbarbi53 is a modified vector of p1'barbi and carries an expression cassette
consisting of the 35S promoter of cauliflower mosaic virus, a multiple cloning site containing Xho I,
SnaBI, Hpa I and Cla I restriction sites and the 35S terminator of cauliflower mosaic virus at the
HindIII site of p1'barbi. The resulting recombinant plasmids are introduced into Agrobacterium as
described in Example 1. The transgenic plant line GUS-TS (obtainable from Dr. H. Vaucheret,
INRA, Versailles Cedex, France) of Arabidopsis thaliana ecotype Columbia, containing a
transcriptionally silenced locus with multiple copies of a chimeric beta-glucuronidase (gus) gene, is
transformed with the recombinant plasmids as described in Example 1, and transformants are
selected as described by Mengiste et al, Plant J 12: 945-948, 1997. pbarbi53 vector DNA is used
in control transformations. The transformants are examined for reactivation of the gus gene by
histochemical staining. A cotyledon leaf is soaked in gus staining solution (100 mM sodium
phosphate buffer (pH 7.0), 0.05% 5-bromo-4-chloro-3-indolyl-beta-D-glucuronide, 0.1% sodium
azide) under vacuum for 10 min and then incubated at 37°C overnight. While strong gus activity is
observed in the plants transformed with the recombinant plasmid carrying the 2.6 kb cDNA, plants
transformed with the recombinant plasmid carrying the 1.8 kb RT-PCR fragment or pbarbi53 do
not show any gus activity above background. Therefore, expression of the antisense RNA of the
2.6 kb cDNA mimics the mutant phenotype and confirms that the sequences shown in SEQ ID NO: 1,
SEQ ID NO: 2 and SEQ ID NO: 3 represent the genetic information for a component of the
transcriptional gene silencing system.




What is claimed is: 

1. DNA comprising an open reading frame encoding a protein characterized by an amino
acid sequence comprising a component sequence of at least 150 amino acid residues
having 40% or more identity with an aligned component sequence of SEQ ID NO: 3.

2. The DNA according to claim 1 comprising an open reading frame encoding a protein 
having the formula R1-R2-R3, wherein 

- R1, R2 and R3 constitute component sequences consisting of amino acid residues
independently selected from the group of the amino acid residues Gly, Ala, Val, Leu,
Ile, Phe, Pro, Ser, Thr, Cys, Met, Trp, Tyr, Asn, Gln, Asp, Glu, Lys, Arg, and His;

- R1 and R3 consist independently of 0 to 3000 amino acid residues;

- R2 consists of at least 150 amino acid residues; and

- R2 is at least 40% identical to an aligned component sequence of SEQ ID NO: 3.

3. The DNA according to claim 1 comprising an open reading frame encoding one or more
SWI2/SNF2-like ATPase/helicase motifs.

4. The DNA according to claim 1, wherein the open reading frame encodes a protein
characterized by the amino acid sequence of SEQ ID NO: 3.

5. The DNA according to claim 1 characterized by the nucleotide sequence of SEQ ID
NO: 1 or SEQ ID NO: 2.

6. The DNA according to claim 1, wherein expression of RNA, complementary to mRNA
transcribed therefrom, releases silencing of a transgenic marker gene.

7. The protein encoded by the open reading frame of any one of claims 1 to 6.

8. A method of producing DNA according to claim 1, comprising

- screening a DNA library for clones which are capable of hybridizing to a fragment of 
the DNA defined by SEQ ID NO: 1 or SEQ ID NO: 2, wherein said fragment has a 
length of at least 15 nucleotides; 

- sequencing hybridizing clones; 

- purifying vector DNA of clones comprising an open reading frame encoding a protein 
characterized by an amino acid sequence comprising a component sequence of at 
least 150 amino acid residues having 40% or more sequence identity to SEQ ID 
NO: 3; and

- optionally further processing the purified DNA.

9. A polymerase chain reaction wherein at least one oligonucleotide used comprises a
sequence of nucleotides which represents 15 or more basepairs of SEQ ID NO: 1 or
SEQ ID NO: 2.

Abstract

The present invention relates to DNA which encodes proteins involved in gene silencing. Related
genes encoding proteins characterized by an amino acid sequence comprising a component
sequence of at least 150 amino acid residues having 40% or more identity with an aligned
component sequence of SEQ ID NO: 3 can be isolated from different sources such as mammalian
or plant cells. Further disclosed is a method for isolating DNA according to the invention.

SEQUENCE LISTING



<110> Novartis AG 

Novartis Research Foundation 

<120> Organic Compounds

<130> S-31005P1 

<140> 
<141> 

<160> 25 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 10329 
<212> DNA

<213> Arabidopsis thaliana
<220> 

<221> intron 

<222> (1009) . . (1295) 

<220> 

<221> intron 

<222> (2551) . . (2673) 

<220> 

<221> intron 

<222> (2753) . . (2867) 

<220> 

<221> intron 

<222> (3114) . . (3506) 

<220> 

<221> intron 

<222> (3681) . . (3973) 

<220> 

<221> intron 

<222> (4896) . . (4975) 

<220> 

<221> intron 

<222> (5218) . . (5777) 

<220> 

<221> intron 

<222> (5883) . . (6082) 



<220> 
<221> intron

<222> (7481) . . (7615)

<220> 

<221> intron 

<222> (7772) . . (7914) 

<220> 

<221> intron 

<222> (8071) . . (8153) 

<220> 

<221> intron 

<222> (8319) . . (8451) 

<220> 

<221> intron 

<222> (8630) . . (8718) 

<220> 

<221> intron 

<222> (8919) . . (9000) 

<220> 

<221> intron 

<222> (9212) . . (9284) 

<400> 1 

aatatttaag tttggtttat attctttcta gtaatctttg aaatattgta agagataatg 60
cttctaataa ataacattgg atttattgga attaatgtat tgaaaaaact atgcaaatac 120
tacagtgtat tttggaacga ccaaaatgat atatgtaaac tttcgttcta gtcttctaca 180
tagtgtaata ggatagcgga caaggttgat cgactctaaa cattatgggt acgtaattcc 240
gcagtggtta cagtctactg tcgaggccaa actggtaatt aaacgtttga agtttagaga 300
aatattttga tgatgagtac cacaatcaaa gatgataggt gttaatcact gtaaaaatgt 360
tgattgaata ctacgaatgc agaacatata catattttta atctctttgg aatttttgtt 420
tttgttttta tcatttttga atacacgaag agctcagtta tatttcatat tgtatatgaa 480
tttgttctat ttaatcttca attctagcaa catactctta tgctaattcg tttcatattt 540
tagtatagta taaaaattac aaatttcaaa acaaactata agtaatatac taacatagtc 600
ggtgtaacat ttcgttaatt tcacataaca tatgttaatt acatatgtac actatttttg 660
aagtatttta taacttaaaa tatataaatt taaatctaag aaatcacaag catgagtttt 720
tccttccggt aatcgtaaaa tcaaaaatcg ctcgctcgag aaacgccggt gctagaagag 780
gaaagtaccg tacataatcc tgcgaaccca attctcgtct tcttcaaact cagttttccg 840
aaaccccaaa caccgcgagg attgcatggc ctgaagaacc acttaatcga gaattgtgct 900
ggaattctca aattttccct cgcgtttttc tttcacactc tcggaatcgg aaatttccac 960
caagctccgt caagcgatag attctgacaa ttacacactt tcgcgcaggt atgcttcctt 1020
ccctgtttta ggttggtgtt aatctatcgg tgaatcgaag gttttgggcc tcgggctttg 1080
cgttttaggt ttttcagaga atcttatcta cttggggatg gatcttaggc gtttgttaga 1140
tgtaactcat tagttttgca tataggaatt ttgatttgaa agttaggtcg ccggatttgt 1200
agacattttg tttgatggtc ttcttcggtg ctcacattct ttgtttttaa gtgcttgatt 1260
tggttgctaa ggtcctttcc gttgcgtgct ctcagtgaat atgaagaaag atgaaaagat 1320
tggtttgacg gggagaacca tttacaccag atccctagca gcttcaattc ctgcctcagt 1380
tgaacaagaa acccctggtt tgaggaggtc aagccggggg acaccatcta cgaaggtaat 1440
aactccagct tctgctacta gaaagtcaga gagactggct ccctcacctg cttcagtttc 1500
aaaaaagtcc ggtggaatcg tcaagaattc cacaccaagt tctttgcgaa ggtccaatag 1560
ggggaagact gaagtatcct tgcagagttc caaaggatca gataattcta tcaggaaagg 1620
agatacttca ccggatattg agcagagaaa ggatagtgtt gaagagtcga cagataagat 1680
caagcctata atgtcagccc gaagttacag ggcattgttt agagggaagc tcaaggaatc 1740 
tgaggcatta gttgatgctt ccccaaatga agaggaacta gtagttgttg gttgttctcg 1800 
ccgcatacct gcaggcaatg atgatgttca aggtaaaaca gattgtccac cacctgcaga 1860 
tgcaggatca aaaaggctgc cagttgacga aactagtttg gacaagggca ctgattttcc 1920
tttgaaatca gttacggaga ccgagaagat agtgcttgat gcatccccca tagttgaaac 1980 
tggggatgac agtgttatag gttcaccatc tgagaattta gagacacaaa agcttcaaga 2040 
tggtaagaca gattgttcac cacctgcaaa tgcagaatcg aaaacgctgc cagttggtga 2100 
aactagttta gaaaaagaat atccacaaaa gtttcaagat gataacacag attgtctacc 2160 
acctgcaaat gcagaatcaa aaaggctgcc agttggcgaa actagtttag aaaaggacac 2220 
tgattttcct ttgaaatcaa ctacggagac tggaaagatg gttctttatg catcccccat 2280 
agttgaaact agggatgaca gcgttatatg ttcaccatct acaaatttag aaacccaaaa 2340 
gcttcttgtc agtaaaactg gcttagaaac cgacatagtt ttgcctttga aaagaaaaag 2400 
agacactgca gaaattgagc tggatgcatg tgctacagtt gcaaatggag atgatcacgt 2460 
tatgagttct gatggggtca ttccatctcc atctgggtgc aaaaatgata atcgacctga 2520 
aatgtgcaac acgtgtaaaa aacggcaaaa gtaagagttt ttttagtgtt gtctgtctat 2580 
tgaaacgatc tgccaatgtt gaatgttggg cagatgggtt tgattcttag gatatatgtt 2640 
ctgtattgta atgagttgtt caaaattttg aagggtcaac ggtgattgtc aaaataggag 2700 
tgtttgctcc tgcattgtcc agccagttga agaatctgat aacgtgacac aggttggttt 2760 
ctaattactt tcggagaccc gttaatcagt ggactcttaa atagttagat actagattta 2820 
cttatccttt tacttgtaat ctgcaattct attttgcatt tgattaggat atgaaagaaa 2880 
ctggaccagt tacgagcaga gaatatgagg agaacgggca aatacaacat ggtaaatcaa 2940 
gtgatcccaa attctattct tcggtgtacc cagagtattg ggttcctgtg cagctatcag 3000 
atgtacagct ggagcaatac tgtcagactc tcttctccaa atccttatct ctttcttcac 3060 
tttcgaagat tgatcttgga gctctagaag aaactctcaa ttctgtaaga aaagtaagtt 3120 
acttgatttt aaaaacactt attcttcaat gcacttgtga gttaagtacc cagttattac 3180 
tggtgataag ataaagaaag caatagaaaa attgataagg tgttcaccgc attgcagcca 3240 
aaaaaacaat tctgtgttcc atgctttcaa gaggttgtca cataggtgtt atgcctttct 3300
gtttgatgtt tggtagagca aaggttttgg gtctatttgt tttatgcttt tttgaaacac 3360
atagaacctg gcaaacttga cagttttggg gttgcttaga tatacgacta ttgtcggtca 3420 
gcatcacatt ttctcaaggc ctctttctgc atgttaatgt gtgaatatat taaaatcttc 3480 
tttatgtgtt tgcaacttgt tgacagacct gtgaccatcc atacgttatg gatgcatctt 3540 
tgaaacaact gctcaccaag aatctggagt tgcatgaaat cctggatgta gaaattaaag 3600 
cgagcgggaa acttcacctc cttgataaaa tgcttactca tataaaaaag aatggtttaa 3660 
aagcagttgt cttctaccag gtgcattttc tattacttgc gaatgtgaat agctctatgt 3720 
ttgtcatgaa tacgtcactt tgtgcattct caatatatgt gcattttctt tttgacaatg 3780 
gaattctgtc ttgtattgaa atttgagtgg gatgaaagta tgctttttat cgtgcaatta 3840 
tgaagtgtaa gttagccttc agcagtcagc tagcattatg agatatgctg aactaaaatg 3900 
tttcttttct cttctttctt tttcgttata tgtgcctcat gtatgtttga attacagttt 3960
ttattttcag caggcaacac aaacccctga agggcttctg cttggtaata ttctcgaaga 4020 
ttttgtgggt caaagatttg gtccaaaatc ttatgagcat gggatatatt cctcaaagaa 4080 
gaactccgct ataaacaatt tcaacaagga gagtcaatgc tgtgttctgc tgttggaaac 4140 
acgtgcctgc agtcaaacca ttaaactctt gcgagctgat gcgtttattc tttttggaag 4200 
cagcttgaat ccatcgcatg atgttaagca cgtagagaag ataaaaatcg agtcatgttc 4260 
tgaaagaact aagatattcc gattgtactc agtatgtaca gttgaagaaa aagccctgat 4320 
tctggctagg caaaatatgc ggcaaaataa agctgtagag aacctaaacc gctctctcac 4380 
gcacgcactg ctcatgtggg gggcgtcata cttatttgat aaactggatc attttcacag 4440 
cagtgaaact ccagattcag gagtttcatt tgaacaatct attatggacg gcgtgattca 4500 
tgaattctcg tccatacttt cttccaaagg tggagaagaa aatgaagtca agctgtgtct 4560 
acttttggag gccaagcatg ctcagggaac ttacagcagt gattctactc tatttggtga 4620 
agaccatatt aagttgtcag atgaagagag tccaaatata ttttggtcaa agctgttggg 4680 
gggaaaaaat cctatgtgga aatacccttc agatactccc caaaggaatc gaaaacgagt 4740 
tcagtatttt gagggttctg aagcgagtcc caaaactggc gatggtggaa atgcaaagaa 4800 
gcgaaagaag gcttctgatg atgtcactga tccccgggtc actgatccgc cagtagatga 4860 
tgatgaaaga aaggcctctg ggaaggatca catgggtaaa atagtttaat ttctgctccg 4920 
atacctctag tzgntcattga ttatgcaact accttgctga ctanctttcc tacaggggct 4980
ttggagtcac caaaagtcat aacactccag tcatcatgta aatcttctgg tacagatggt 5040
acatcggatg gaaatgatgc ttttggcctg tattctatgg gcagccatat ctctggaatc 5100
ccagaggata tgttagctag tcaagattgg gggaaaatac cggatgaatc acagaggagg 5160
ctccacactg ttttaaagcc gaagatggca aaactttgcc aagttttgca tctttcagta 5220
agtggccttt ttcacctcca caacttattt tagccttgca tatgcttata tatagctgat 5280
tgcaactgta gttgttacct gatttcctgt tacagccaaa tgtgagagtt ttattcttca 5340
actatatcca tccgtttaag catattttat ttcttatatc tggcttcgtt accaatgcac 5400
tgttaaaatg agcaactgct gcacaaaaca gtaggtagtt atgtgcctca tgtcattcat 5460
tgtttattga agcaaagaaa tttctgtcta ctttacatga tccatctgtg ggagtatata 5520
actatatata accttaggcc tttgtacctg gctgatcaaa gacatgtcaa aagtttatct 5580
gttcgctgtt ggtatagaaa ctaatacagt gtctgatgct attttaaggt agtcttatgt 5640
cttcacatat tggctaatag atgtttccgc tgtcgtgtcc atatacttct gtgattatca 5700
cggtgctccg tctatcaaaa ttgtactaaa aggtattttg caatgtgtga ttggttaaca 5760
gattattttg ttttcaggat gcttgcacaa gcatggtcgg aaattttctc gaatatgtta 5820
ttgaaaatca ccgaatctac gaagagccag ccactacttt tcaggcattc cagatagccc 5880
tggtatgaca gcatttactt tgataattta tgcattgttt ccttcatcat ctgcctttgt 5940
ttagaatgtc ctcagaaggc agcactcctt tagttttaac tttccaatca taggattcaa 6000
atatccatta actggccttt gatcgctgca taatatatga atagttgaca tactgaatac 6060
gttgttaata atgcattttc agagttggat tgcagccttg ttggtaaagc aaattcttag 6120
ccacaaagaa tctctggtcc gtgcaaattc tgaattagct ttcaaatgct ctagagtaga 6180
ggtggattat atttattcga tattgtcctg catgaagagt ctgttcctgg agcatacaca 6240
aggtttgcag ttcgattgct ttggtactaa ttctaaacag tcagtggtta gcacaaaact 6300
agtaaatgaa agtctctcag gggctacagt gcgtgacgaa aagattaata cgaagtcgat 6360
gcgaaatagc tcagaggatg aagagtgcat gactgagaag agatgtagcc attatagcac 6420
agcaacaaga gatatcgaaa agactattag tggcataaaa aagaaataca agaagcaagt 6480
gcaaaagctt gtacaagagc atgaggaaaa gaaaatggag ctgttaaata tgtatgcaga 6540
caagaagcag aaacttgaaa ctagtaaaag tgtggaagca gcagtaattc gtattacctg 6600
ttcacggacc agtactcaag tgggtgatct caaactgctg gatcataatt atgaaagaaa 6660
gtttgatgaa atcaaaagtg agaaaaatga atgcctcaaa agtctggagc aaatgcacga 6720
ggttgcaaag aagaagttgg ctgaggatga agcctgttgg attaatcgga taaagagctg 6780
ggcagctaaa ttaaaagttt gtgttcccat tcaaagtggc aataacaagc attttagtgg 6840
ttcatcaaac atttcccaaa atgctcctga tgtacaaatt tgcaataatg ctaacgttga 6900
agctacttac gctgatacga attgcatggc ttccaaggtt aatcaagtgc cagaagcaga 6960
aaacacatta ggaaccatgt cgggtggcag cactcaacaa gttcatgaaa tggtggatgt 7020
aagaaatgac gagacaatgg atgtctcagc tttgtctcgt gaacagctta caaagagcca 7080
gtccaatgag cacgcttcta tcactgtgcc tgagattttg attcctgctg actgtcaaga 7140
ggaatttgcg gccttgaacg tgcatttgtc agaagaccag aattgtgaca gaataacatc 7200
tgcggcatca gatgaagatg tttcatcaag ggtgccagag gtatcccagt cactcgaaaa 7260
tctttctgcc tcccccgagt tttctctaaa tagagaggag gctttggtta caacagaaaa 7320
tagaagaaca agtcatgtgg gttttgatac tgataacatt ttggaccagc agaatagaga 7380
agattgttct cttgaccaag agattcctga cgagttagcg atgcctgtgc aacatcttgc 7440
gtctgtggta gagactaggg gtgctgctga atctgatcag gtacttactg gccctgtaga 7500
atagttgatg ccttgttcat ttaatctttt ctaatgttca ttcttgcttt cttgaaaata 7560
acgggtagtg atcagatgtc tttttttctc ttattaaatt cacttttctg gacagtatgg 7620
tcaagatata tgtcctatgc cttcttcact ggctggaaag caacctgacc cagcagcaaa 7680
cactgagagc gaaaatcttg aagaagcaat tgagcctcag tctgctggtt cagaaacagt 7740
agagactact gattttgctg catcacatca ggtccctatt gaagactttc cttttttact 7800
agtttaaagt tatcaatctg tgttatgttc attctaagtt tccgtgagaa aaaggtgggg 7860
aaatgtggtt actgatcaag tctcgttgtt gttttaaatc gactcttttg acagggtgat 7920
caagttacat gtcctttgct atcttcaccg actggaaatc agcctgcgcc agaagcaaat 7980
attgaaggcc aaaatatcaa cacatcagct gagccccatg tagcgggtcc agatgcagta 8040
gagagtggtg attatgcagt aatagatcag gttattgcct taactaaaga caaatgtctt 8100
ttgttgttta aaagtcttac atctttgtaa tgctcgttct ggatatcctg caggaaacaa 8160
tgggtgctca ggatgcatgc tctctgccat ctggatcggt tggaactcag tctgacctag 8220
gagcaaacat tgagggtcaa aatgtcacaa cagtggctca acttcccaca gatggatcag 8280
atgcagttgt aaccggtgga tctcctgtat cagatcaggt acctgcctct gctcaaggac 8340 
tttcttatgt gttggtttaa aggtctagtc cttagtaatg ttgaaactaa gcaaacagtg 8400 
gatagtgatc atatggttat ttttgcttgt gaatttaata tttctggaca gtgtgcccag 8460 
gatgcatctc ctatgccatt atcttcgcct ggaaatcacc ctgatacagc agttaatatc 8520 
gagggtttag ataacacatc agtagctgag cctcatataa gtggatcaga tgcatgtgaa 8580 
atggaaattt cagaacctgg tccccaagta gagcggtcaa cctttgcaag tcagtaactg 8640 
ccttgggcat ttttaagtat cacctaggtc gacatatgtg attgccaaac agctaacaag 8700 
gagatgcctt ttgtgcagat cttttccatg aaggtggcgt ggagcattca gcaggtgtaa 8760 
cagctcttgt tccatcactt cttaacaatg gtacggaaca gattgccgtt caacctgttc 8820 
ctcaaatacc tttccctgtg ttcaacgacc cgtttctgca tgaactggag aagttgcgga 8880 
gagaatcaga gaactcaaag aagacttttg aagaaaaagt cagtttccct cattacccag 8940 
ttacctcttg ttttggttta ttttctagct gcccattgac tctcagttgc ttgtgagcag 9000 
aaatcaatct tgaaagctga actcgagagg aagatggctg aagtacaagc agagtttcga 9060 
agaaaatttc atgaggtaga agccgagcat aacaccagaa cgacaaagat agagaaggat 9120 
aagaatcttg ttataatgaa caaactgttg gcgaatgcgt tcttgtccaa atgtactgac 9180 
aagaaggtat ctccctcagg agctccaagg ggtaagtgtc gaataatata gcaaattggt 9240 
tttaaaaata aggcgacgaa gtcataatag cactttttct ccaggtaaaa ttcagcagct 9300 
agcacagaga gcagcacaag tgagtgcact gagaaattac attgctcctc agcagcttca 9360 
ggcatcttct tttcctgctc ctgctctggt ttcggctcct ctgcaacttc agcaatcatc 9420 
atttcctgct cctggtccgg ctcctctgca gcctcaggca tcttcgtttc cttcttcagt 9480 
ctctcgtcca tcagcccttc ttctgaattt tgcggtctgt ccaatgcctc agcccagaca 9540 
gcctctcata tccaacatag ctccaactcc atcagttact cctgcaacaa atccaggtct 9600 
gcgttctcct gcaccacacc taaactcata tagaccatcc tcttcaactc ccgtcgccac 9660 
agctactcca acctcgtcag tgcctcctca agctttgaca tattcagctg tgtcaattca 9720 
gcagcagcaa gaacaacaac cgcaacagag cttgagcagt ggattgcaga gcaacaatga 9780 
agtggtttgt ctttctgacg acgagtgacc taagaggaga gatggttagg gtcttagtta 9840 
ttgattttta gagagttaat aatagtatat atatatatgt ataagtaggt tacctaatct 9900 
ctgtcgttaa tctaatttag tgagtcagga accgactcgt tggctaaggt ctctcctttt 9960 
gaaacgcaac gttctacttt catgtatata aatacagtct gatcacacaa cacaaattga 10020 
tgattgaaaa tactactgat ttaactttat agaaaaccca aattatagag cgacaacttt 10080 
ataaacatgt caaacttcga agttaaaatt taagacccca taattttaca attatagatt 10140 
ttaatactcc aactattttg tgatgttaaa agaagtatcc gagtcttttc tttccagttt 10200 
ccccaccgtc ccatgactcc cccagccagt agaaaaagcc aaaaaagtaa acaaaaagtc 10260 
gttaaaaaag ttaaattaaa aaaaaaatag atagttgacg tttactaaag tgatttgaat 10320 
tgaacaatt 10329 
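
The intron features listed for SEQ ID NO: 1 give 1-based, inclusive coordinates. The following Python sketch shows one way such coordinates could be used to remove introns from a genomic sequence and join the remaining exons. The function name and the toy sequence are assumptions for illustration only; the toy sequence is not part of SEQ ID NO: 1.

```python
def splice(genomic: str, introns: list[tuple[int, int]]) -> str:
    """Remove 1-based, inclusive intron intervals and return the joined exons."""
    pieces = []
    cursor = 0  # 0-based index of the next base not yet consumed
    for start, end in sorted(introns):
        if start > end or start - 1 < cursor or end > len(genomic):
            raise ValueError(f"invalid or overlapping intron: ({start}, {end})")
        pieces.append(genomic[cursor:start - 1])  # exon before this intron
        cursor = end                              # skip the intron itself
    pieces.append(genomic[cursor:])               # trailing exon
    return "".join(pieces)


# Toy example: upper case marks exon bases, lower case marks intron bases.
toy_genomic = "AAAgtccagTTTgtagCC"
toy_introns = [(4, 9), (13, 16)]
toy_spliced = splice(toy_genomic, toy_introns)
```

Applied to a full genomic sequence and its listed intron ranges, the result could be compared against the corresponding cDNA as a consistency check on the coordinates.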



<210> 2 
<211> 6571 
<212> DNA

<213> Arabidopsis thaliana 

<220> 
<221> CDS 

<222> (310) . . (6312) 
<400> 2 

cacaagcatg agtttttcct tccggtaatc gtaaaatcaa aaatcgctcg ctcgagaaac 60 
gccggtgcta gaagaggaaa gtaccgtaca taatcctgcg aacccaattc tcgtcttctt 120 
caaactcagt tttccgaaac cccaaacacc gcgaggattg catggcctga agaaccactt 180 
aatcgagaat tgtgctggaa ttctcaaatt ttccctcgcg tttttctttc acactctcgg 240 
aatcggaaat ttccaccaag ctccgtcaag cgatagattc tgacaattac acactttcgc 300

gcagtgaat atg aag aaa gat gaa aag att ggt ttg acg ggg aga acc att 351
Met Lys Lys Asp Glu Lys Ile Gly Leu Thr Gly Arg Thr Ile
1 5 10

tac acc aga tec eta gca get tea att cct gee tea gtt gaa caa gaa 399 
Tyr Thr Arg Ser Leu Ala Ala Ser He Pro Ala Ser Val Glu Gin Glu 
15 20 25 30 

acc cct ggt ttg agg agg tea age egg ggg aca cca tct acg aag gta 447 
Thr Pro Gly Leu Arg Arg Ser Ser Arg Gly Thr Pro Ser Thr Lys Val 
35 40 45 

ata act cca get tct get act aga aag tea gag aga ctg get ccc tea 495 
He Thr Pro Ala Ser Ala Thr Arg Lys Ser Glu Arg Leu Ala Pro Ser 
50 55 60 

cct get tea gtt tea aaa aag tec ggt gga ate gtc aag aat tec aca 543 
Pro Ala Ser Val Ser Lys Lys Ser Gly Gly He Val Lys Asn Ser Thr 
65 70 75 

cca agt tct ttg cga agg tec aat agg ggg aag act gaa gta tec ttg 591 
Pro Ser Ser Leu Arg Arg Ser Asn Arg Gly Lys Thr Glu Val Ser Leu 
80 85 90 

cag agt tcc aaa gga tca gat aat tct atc agg aaa gga gat act tca 639
Gln Ser Ser Lys Gly Ser Asp Asn Ser Ile Arg Lys Gly Asp Thr Ser
95 100 105 110

ccg gat att gag cag aga aag gat agt gtt gaa gag teg aca gat aag 687 
Pro Asp He Glu Gin Arg Lys Asp Ser Val Glu Glu Ser Thr Asp Lys 
115 120 125 

ate aag cct ata atg tea gee cga agt tac agg gca ttg ttt aga ggg 735 
He Lys Pro He Met Ser Ala Arg Ser Tyr Arg Ala Leu Phe Arg Gly 
130 135 140 

aag etc aag gaa tct gag gca tta gtt gat get tec cca aat gaa gag 783 
Lys Leu Lys Glu Ser Glu Ala Leu Val Asp Ala Ser Pro Asn Glu Glu 
145 150 155 

gaa eta gta gtt gtt ggt tgt tct cgc cgc ata cct gca ggc aat gat 831 
Glu Leu Val Val Val Gly Cys Ser Arg Arg He Pro Ala Gly Asn Asp 
160 165 170 

gat gtt caa ggt aaa aca gat tgt cca cca cct gca gat gca gga tca 879
Asp Val Gln Gly Lys Thr Asp Cys Pro Pro Pro Ala Asp Ala Gly Ser
175 180 185 190

aaa agg ctg cca gtt gac gaa act agt ttg gac aag ggc act gat ttt 927
Lys Arg Leu Pro Val Asp Glu Thr Ser Leu Asp Lys Gly Thr Asp Phe
195 200 205
cct ttg aaa tca gtt acg gag acc gag aag ata gtg ctt gat gca tcc 975
Pro Leu Lys Ser Val Thr Glu Thr Glu Lys Ile Val Leu Asp Ala Ser
210 215 220

ccc ata gtt gaa act ggg gat gac agt gtt ata ggt tca cca tct gag 1023
Pro Ile Val Glu Thr Gly Asp Asp Ser Val Ile Gly Ser Pro Ser Glu
225 230 235

aat tta gag aca caa aag ctt caa gat ggt aag aca gat tgt tca cca 1071
Asn Leu Glu Thr Gln Lys Leu Gln Asp Gly Lys Thr Asp Cys Ser Pro
240 245 250 

cct gca aat gca gaa teg aaa acg ctg cca gtt ggt gaa act agt tta 1119 
Pro Ala Asn Ala Glu Ser Lys Thr Leu Pro Val Gly Glu Thr Ser Leu 
255 260 265 270 

gaa aaa gaa tat cca caa aag ttt caa gat gat aac aca gat tgt cta 1167
Glu Lys Glu Tyr Pro Gln Lys Phe Gln Asp Asp Asn Thr Asp Cys Leu
275 280 285

cca cct gca aat gca gaa tea aaa agg ctg cca gtt ggc gaa act agt 1215 
Pro Pro Ala Asn Ala Glu Ser Lys Arg Leu Pro Val Gly Glu Thr Ser 
290 295 300 

tta gaa aag gac act gat ttt cct ttg aaa tea act acg gag act gga 1263 
Leu Glu Lys Asp Thr Asp Phe Pro Leu Lys Ser Thr Thr Glu Thr Gly 
305 310 315 

aag atg gtt ctt tat gca tec ccc ata gtt gaa act agg gat gac age 1311 
Lys Met Val Leu Tyr Ala Ser Pro lie Val Glu Thr Arg Asp Asp Ser 
320 325 330 

gtt ata tgt tca cca tct aca aat tta gaa acc caa aag ctt ctt gtc 1359
Val Ile Cys Ser Pro Ser Thr Asn Leu Glu Thr Gln Lys Leu Leu Val
335 340 345 350

agt aaa act ggc tta gaa acc gac ata gtt ttg cct ttg aaa aga aaa 1407
Ser Lys Thr Gly Leu Glu Thr Asp Ile Val Leu Pro Leu Lys Arg Lys
355 360 365

aga gac act gca gaa att gag ctg gat gca tgt get aca gtt gca aat 1455 
Arg Asp Thr Ala Glu He Glu Leu Asp Ala Cys Ala Thr Val Ala Asn 
370 375 380 

gga gat gat cac gtt atg agt tct gat ggg gtc att cca tct cca tct 1503 
Gly Asp Asp His Val Met Ser Ser Asp Gly Val He Pro Ser Pro Ser 
385 390 395 

ggg tgc aaa aat gat aat cga cct gaa atg tgc aac acg tgt aaa aaa 1551 
Gly Cys Lys Asn Asp Asn Arg Pro Glu Met Cys Asn Thr Cys Lys Lys 
400 405 410 

egg caa aag gtc aac ggt gat tgt caa aat agg agt gtt tgc tec tgc 1599 
Arg Gin Lys Val Asn Gly Asp Cys Gin Asn Arg Ser Val Cys Ser Cys 
415 420 425 430 
att gtc cag cca gtt gaa gaa tct gat aac gtg aca cag gat atg aaa 1647
Ile Val Gln Pro Val Glu Glu Ser Asp Asn Val Thr Gln Asp Met Lys
435 440 445

gaa act gga cca gtt acg agc aga gaa tat gag gag aac ggg caa ata 1695
Glu Thr Gly Pro Val Thr Ser Arg Glu Tyr Glu Glu Asn Gly Gln Ile
450 455 460

caa cat ggt aaa tca agt gat ccc aaa ttc tat tct tcg gtg tac cca 1743
Gln His Gly Lys Ser Ser Asp Pro Lys Phe Tyr Ser Ser Val Tyr Pro
465 470 475

gag tat tgg gtt cct gtg cag eta tea gat gta cag ctg gag caa tac 1791 
Glu Tyr Trp Val Pro Val Gin Leu Ser Asp Val Gin Leu Glu Gin Tyr 
480 485 490 

tgt cag act etc ttc tec aaa tec tta tct ctt tct tea ctt teg aag 1839 
Cys Gin Thr Leu Phe Ser Lys Ser Leu Ser Leu Ser Ser Leu Ser Lys 
495 500 505 510 

att gat ctt gga gct cta gaa gaa act ctc aat tct gta aga aaa acc 1887
Ile Asp Leu Gly Ala Leu Glu Glu Thr Leu Asn Ser Val Arg Lys Thr
515 520 525

tgt gac cat cca tac gtt atg gat gca tct ttg aaa caa ctg ctc acc 1935
Cys Asp His Pro Tyr Val Met Asp Ala Ser Leu Lys Gln Leu Leu Thr
530 535 540

aag aat ctg gag ttg cat gaa atc ctg gat gta gaa att aaa gcg agc 1983
Lys Asn Leu Glu Leu His Glu Ile Leu Asp Val Glu Ile Lys Ala Ser
545 550 555

ggg aaa ctt cac etc ctt gat aaa atg ctt act cat ata aaa aag aat 2031 
Gly Lys Leu His Leu Leu Asp Lys Met Leu Thr His He Lys Lys Asn 
560 565 570 

ggt tta aaa gca gtt gtc ttc tac cag gca aca caa ace cct gaa ggg 2079 
Gly Leu Lys Ala Val Val Phe Tyr Gin Ala Thr Gin Thr Pro Glu Gly 
575 580 585 590 

ctt ctg ctt ggt aat att ctc gaa gat ttt gtg ggt caa aga ttt ggt 2127
Leu Leu Leu Gly Asn Ile Leu Glu Asp Phe Val Gly Gln Arg Phe Gly
595 600 605

cca aaa tct tat gag cat ggg ata tat tcc tca aag aag aac tcc gct 2175
Pro Lys Ser Tyr Glu His Gly Ile Tyr Ser Ser Lys Lys Asn Ser Ala
610 615 620

ata aac aat ttc aac aag gag agt caa tgc tgt gtt ctg ctg ttg gaa 2223 
He Asn Asn Phe Asn Lys Glu Ser Gin Cys Cys Val Leu Leu Leu Glu 
625 630 635 

aca cgt gcc tgc agt caa acc att aaa ctc ttg cga gct gat gcg ttt 2271
Thr Arg Ala Cys Ser Gln Thr Ile Lys Leu Leu Arg Ala Asp Ala Phe
640 645 650

att ctt ttt gga age age ttg aat cca teg cat gat gtt aag cac gta 2319 

lie Leu Phe Gly Ser Ser Leu Asn Pro Ser His Asp Val Lys His Val 
655 660 665 670 

gag aag ata aaa ate gag tea tgt tct gaa aga act aag ata ttc cga 2367 

Glu Lys lie Lys lie Glu Ser Cys Ser Glu Arg Thr Lys lie Phe Arg 

675 680 685 

ttg tac tea gta tgt aca gtt gaa gaa aaa gee ctg att ctg get agg 2415 

Leu Tyr Ser Val Cys Thr Val Glu Glu Lys Ala Leu lie Leu Ala Arg 
690 695 700 

caa aat atg egg caa aat aaa get gta gag aac eta aac cgc tct etc 2463 

Gin Asn Met Arg Gin Asn Lys Ala Val Glu Asn Leu Asn Arg Ser Leu 
705 710 715 

acg cac gca ctg etc atg tgg ggg gcg tea tac tta ttt gat aaa ctg 2511 

Thr His Ala Leu Leu Met Trp Gly Ala Ser Tyr Leu Phe Asp Lys Leu 
720 725 730 

gat cat ttt cac age agt gaa act cca gat tea gga gtt tea ttt gaa 2559 

Asp His Phe His Ser Ser Glu Thr Pro Asp Ser Gly Val Ser Phe Glu 
735 740 745 750 

caa tct att atg gac ggc gtg att cat gaa ttc teg tec ata ctt tct 2607 

Gin Ser lie Met Asp Gly Val lie His Glu Phe Ser Ser lie Leu Ser 

755 760 765 

tec aaa ggt gga gaa gaa aat gaa gtc aag ctg tgt eta ctt ttg gag 2655 

Ser Lys Gly Gly Glu Glu Asn Glu Val Lys Leu Cys Leu Leu Leu Glu 
770 775 780

gec aag cat get cag gga act tac age agt gat tct act eta ttt ggt 2703 

Ala Lys His Ala Gin Gly Thr Tyr Ser Ser Asp Ser Thr Leu Phe Gly 
785 790 795 

gaa gac cat att aag ttg tea gat gaa gag agt cca aat ata ttt tgg 2751 

Glu Asp His lie Lys Leu Ser Asp Glu Glu Ser Pro Asn lie Phe Trp 
800 805 810 

tea aag ctg ttg ggg gga aaa aat cct atg tgg aaa tac cct tea gat 2799 

Ser Lys Leu Leu Gly Gly Lys Asn Pro Met Trp Lys Tyr Pro Ser Asp 
815 820 825 830

act ccc caa agg aat cga aaa cga gtt cag tat ttt gag ggt tct gaa 2847 

Thr Pro Gin Arg Asn Arg Lys Arg Val Gin Tyr Phe Glu Gly Ser Glu 

835 840 845

gcg agt ccc aaa act ggc gat ggt gga aat gca aag aag cga aag aag 2895 

Ala Ser Pro Lys Thr Gly Asp Gly Gly Asn Ala Lys Lys Arg Lys Lys 
850 855 860 

gct tct gat gat gtc act gat ccc cgg gtc act gat ccg cca gta gat 2943
Ala Ser Asp Asp Val Thr Asp Pro Arg Val Thr Asp Pro Pro Val Asp
865 870 875

gat gat gaa aga aag gcc tct ggg aag gat cac atg ggg gct ttg gag 2991
Asp Asp Glu Arg Lys Ala Ser Gly Lys Asp His Met Gly Ala Leu Glu
880 885 890

tca cca aaa gtc ata aca ctc cag tca tca tgt aaa tct tct ggt aca 3039
Ser Pro Lys Val Ile Thr Leu Gln Ser Ser Cys Lys Ser Ser Gly Thr
895 900 905 910

gat ggt aca ttg gat gga aat gat get ttt ggc ttg tat tct atg ggc 3087 
Asp Gly Thr Leu Asp Gly Asn Asp Ala Phe Gly Leu Tyr Ser Met Gly 
915 920 925 

age cat ate tct gga ate cca gag gat atg tta get agt caa gat tgg 3135 
Ser His lie Ser Gly lie Pro Glu Asp Met Leu Ala Ser Gin Asp Trp 
930 935 940 

ggg aaa ata ccg gat gaa tea cag agg agg etc cac act gtt tta aag 3183 
Gly Lys lie Pro Asp Glu Ser Gin Arg Arg Leu His Thr Val Leu Lys 
945 950 955 

ccg aag atg gca aaa ctt tgc caa gtt ttg cat ctt tea gat get tgc 3231 
Pro Lys Met Ala Lys Leu Cys Gin Val Leu His Leu Ser Asp Ala Cys 
960 965 970 

aca age atg gtc gga aat ttt etc gaa tat gtt att gaa aat cac cga 3279 
Thr Ser Met Val Gly Asn Phe Leu Glu Tyr Val He Glu Asn His Arg 

975 980 985 990 

ate tac gaa gag cca gcc act act ttt cag gca ttc cag ata gcc ctg 3327 
He Tyr Glu Glu Pro Ala Thr Thr Phe Gin Ala Phe Gin He Ala Leu 
995 1000 1005 

agt tgg att gca gcc ttg ttg gta aag caa att ctt age cac aaa gaa 3375 
Ser Trp He Ala Ala Leu Leu Val Lys Gin He Leu Ser His Lys Glu 
1010 1015 1020 

tct ctg gtc cgt gca aat tct gaa tta get ttc aaa tgc tct aga gta 3423 
Ser Leu Val Arg Ala Asn Ser Glu Leu Ala Phe Lys Cys Ser Arg Val 
1025 1030 1035

gag gtg gat tat att tat teg ata ttg tec tgc atg aag agt ctg ttc 3471 
Glu Val Asp Tyr He Tyr Ser He Leu Ser Cys Met Lys Ser Leu Phe 
1040 1045 1050 

ctg gag cat aca caa ggt ttg cag ttc gat tgc ttt ggt act aat tct 3519 
Leu Glu His Thr Gin Gly Leu Gin Phe Asp Cys Phe Gly Thr Asn Ser 
1055 1060 1065 1070

aaa cag tea gtg gtt age aca aaa eta gta aat gaa agt etc tea ggg 3567 
Lys Gin Ser Val Val Ser Thr Lys Leu Val Asn Glu Ser Leu Ser Gly 
1075 1080 1085 
gcc aca gtg cgt gac gaa aag att aat acg aag tcg atg cga aat agc 3615
Ala Thr Val Arg Asp Glu Lys Ile Asn Thr Lys Ser Met Arg Asn Ser
1090 1095 1100

tca gag gat gaa gag tgc atg act gag aag aga tgt agc cat tat agc 3663
Ser Glu Asp Glu Glu Cys Met Thr Glu Lys Arg Cys Ser His Tyr Ser
1105 1110 1115

aca gca aca aga gat atc gaa aag act att agt ggc ata aaa aag aaa 3711
Thr Ala Thr Arg Asp Ile Glu Lys Thr Ile Ser Gly Ile Lys Lys Lys
1120 1125 1130

tac aag aag caa gtg caa aag ctt gta caa gag cat gag gaa aag aaa 3759
Tyr Lys Lys Gln Val Gln Lys Leu Val Gln Glu His Glu Glu Lys Lys
1135 1140 1145 1150

atg gag ctg tta aat atg tat gca gac aag aag cag aaa ctt gaa act 3807
Met Glu Leu Leu Asn Met Tyr Ala Asp Lys Lys Gln Lys Leu Glu Thr
1155 1160 1165

agt aaa agt gtg gaa gca gca gta att cgt att acc tgt tca cgg acc 3855
Ser Lys Ser Val Glu Ala Ala Val Ile Arg Ile Thr Cys Ser Arg Thr
1170 1175 1180

agt act caa gtg ggt gat etc aaa ctg ctg gat cat aat tat gaa aga 3903 
Ser Thr Gin Val Gly Asp Leu Lys Leu Leu Asp His Asn Tyr Glu Arg 
1185 1190 1195 

aag ttt gat gaa ate aaa agt gag aaa aat gaa tgc etc aaa agt ctg 3951 
Lys Phe Asp Glu lie Lys Ser Glu Lys Asn Glu Cys Leu Lys Ser Leu 
1200 1205 1210 

gag caa atg cac gag gtt gca aag aag aag ttg gct gag gat gaa gcc 3999
Glu Gln Met His Glu Val Ala Lys Lys Lys Leu Ala Glu Asp Glu Ala
1215 1220 1225 1230

tgt tgg att aat cgg ata aag agc tgg gca gct aaa tta aaa gtt tgt 4047
Cys Trp Ile Asn Arg Ile Lys Ser Trp Ala Ala Lys Leu Lys Val Cys
1235 1240 1245

gtt ccc att caa agt ggc aat aac aag cat ttt agt ggt tea tea aac 4095 
Val Pro He Gin Ser Gly Asn Asn Lys His Phe Ser Gly Ser Ser Asn 
1250 1255 1260 

att tec caa aat get cct gat gta caa att tgc aat aat get aac gtt 4143 
He Ser Gin Asn Ala Pro Asp Val Gin He Cys Asn Asn Ala Asn Val 
1265 1270 1275 

gaa gct act tac gct gat acg aat tgc atg gct tcc aag gtt aat caa 4191
Glu Ala Thr Tyr Ala Asp Thr Asn Cys Met Ala Ser Lys Val Asn Gln
1280 1285 1290

gtg cca gaa gca gaa aac aca tta gga acc atg tcg ggt ggc agc act 4239
Val Pro Glu Ala Glu Asn Thr Leu Gly Thr Met Ser Gly Gly Ser Thr
1295 1300 1305 1310
caa caa gtt cat gaa atg gtg gat gta aga aat gac gag aca atg gat 4287
Gln Gln Val His Glu Met Val Asp Val Arg Asn Asp Glu Thr Met Asp
1315 1320 1325

gtc tca gct ttg tcc cgt gaa cag ctt aca aag agc cag tcc aat gag 4335
Val Ser Ala Leu Ser Arg Glu Gln Leu Thr Lys Ser Gln Ser Asn Glu
1330 1335 1340

cac get tct ate act gtg cct gag att ttg att cct get gac tgt caa 4383 
His Ala Ser He Thr Val Pro Glu He Leu He Pro Ala Asp Cys Gin 
1345 1350 1355 

gag gaa ttt gcg gee ttg aac gtg cat ttg tea gaa gac cag aat tgt 4431 
Glu Glu Phe Ala Ala Leu Asn Val His Leu Ser Glu Asp Gin Asn Cys 
1360 1365 1370 

gac aga ata aca tct gcg gca tea gat gaa gat gtt tea tea agg gtg 4479 
Asp Arg He Thr Ser Ala Ala Ser Asp Glu Asp Val Ser Ser Arg Val 
1375 1380 1385 1390 

cca gag gta tec cag tea etc gaa aat ctt tct gec tec ccc gag ttt 4527 
Pro Glu Val Ser Gin Ser Leu Glu Asn Leu Ser Ala Ser Pro Glu Phe 
1395 1400 1405 

tct eta aat aga gag gag get ttg gtt aca aca gaa aat aga aga aca 4575 
Ser Leu Asn Arg Glu Glu Ala Leu Val Thr Thr Glu Asn Arg Arg Thr 
1410 1415 1420 

agt cat gtg ggt ttt gat act gat aac att ttg gac cag cag aat aga 4623 
Ser His Val Gly Phe Asp Thr Asp Asn He Leu Asp Gin Gin Asn Arg 
1425 1430 1435 

gaa gat tgt tct ctt gac caa gag att cct gac gag tta gcg atg cct 4671 
Glu Asp Cys Ser Leu Asp Gin Glu He Pro Asp Glu Leu Ala Met Pro 
1440 1445 1450 

gtg caa cat ctt gcg tct gtg gta gag act agg ggt get get gaa tct 4719 
Val Gin His Leu Ala Ser Val Val Glu Thr Arg Gly Ala Ala Glu Ser 
1455 1460 1465 1470 

gat cag tat ggt caa gat ata tgt cct atg cct tct tea ctg get gga 4767 
Asp Gin Tyr Gly Gin Asp He Cys Pro Met Pro Ser Ser Leu Ala Gly 
1475 1480 1485 

aag caa cct gac cca gca gca aac act gag age gaa aat ctt gaa gaa 4815 
Lys Gin Pro Asp Pro Ala Ala Asn Thr Glu Ser Glu Asn Leu Glu Glu 
1490 1495 1500 

gca att gag cct cag tct get ggt tea gaa aca gta gag act act gat 4863 
Ala He Glu Pro Gin Ser Ala Gly Ser Glu Thr Val Glu Thr Thr Asp 
1505 1510 1515 

ttt gct gca tca cat cag ggt gat caa gtt aca tgt cct ttg cta tct 4911
Phe Ala Ala Ser His Gln Gly Asp Gln Val Thr Cys Pro Leu Leu Ser
1520 1525 1530

tca ccg act gga aat cag cct gcg cca gaa gca aat att gaa ggc caa 4959
Ser Pro Thr Gly Asn Gln Pro Ala Pro Glu Ala Asn Ile Glu Gly Gln
1535 1540 1545 1550 

aat ate aac aca tea get gag ccc cat gta gcg ggt cca gat gca gta 5007 
Asn lie Asn Thr Ser Ala Glu Pro His Val Ala Gly Pro Asp Ala Val 
1555 1560 1565 

gag agt ggt gat tat gca gta ata gat cag gaa aca atg ggt get cag 5055 
Glu Ser Gly Asp Tyr Ala Val He Asp Gin Glu Thr Met Gly Ala Gin 
1570 1575 1580 

gat gca tgc tct ctg cca tct gga teg gtt gga act cag tct gac eta 5103 
Asp Ala Cys Ser Leu Pro Ser Gly Ser Val Gly Thr Gin Ser Asp Leu 
1585 1590 1595 

gga gca aac att gag ggt caa aat gtc aca aca gtg get caa ctt ccc 5151 
Gly Ala Asn lie Glu Gly Gin Asn Val Thr Thr Val Ala Gin Leu Pro 
1600 1605 1610 

aca gat gga tea gat gca gtt gta ace ggt gga tct cct gta tea gat 5199 
Thr Asp Gly Ser Asp Ala Val Val Thr Gly Gly Ser Pro Val Ser Asp 
1615 1620 1625 1630 

cag tgt gcc cag gat gca tct cct atg cca tta tct tcg cct gga aat 5247
Gln Cys Ala Gln Asp Ala Ser Pro Met Pro Leu Ser Ser Pro Gly Asn
1635 1640 1645

cac cct gat aca gca gtt aat atc gag ggt tta gat aac aca tca gta 5295
His Pro Asp Thr Ala Val Asn Ile Glu Gly Leu Asp Asn Thr Ser Val
1650 1655 1660

get gag cct cat ata agt gga tea gat gca tgt gaa atg gaa att tea 5343 
Ala Glu Pro His He Ser Gly Ser Asp Ala Cys Glu Met Glu He Ser 
1665 1670 1675 

gaa cct ggt ccc caa gta gag egg tea ace ttt gca aat ctt ttc cat 5391 
Glu Pro Gly Pro Gin Val Glu Arg Ser Thr Phe Ala Asn Leu Phe His 
1680 1685 1690 

gaa ggt ggc gtg gag cat tca gca ggt gta aca gct ctt gtt cca tca 5439
Glu Gly Gly Val Glu His Ser Ala Gly Val Thr Ala Leu Val Pro Ser
1695 1700 1705 1710

ctt ctt aac aat ggt acg gaa cag att gee gtt caa cct gtt cct caa 5487 
Leu Leu Asn Asn Gly Thr Glu Gin He Ala Val Gin Pro Val Pro Gin 
1715 1720 1725 

ata cct ttc cct gtg ttc aac gac ccg ttt ctg cat gaa ctg gag aag 5535
Ile Pro Phe Pro Val Phe Asn Asp Pro Phe Leu His Glu Leu Glu Lys
1730 1735 1740 

ttg cgg aga gaa tca gag aac tca aag aag act ttt gaa gaa aaa aaa 5583
Leu Arg Arg Glu Ser Glu Asn Ser Lys Lys Thr Phe Glu Glu Lys Lys
1745 1750 1755

tca atc ttg aaa gct gaa ctc gag agg aag atg gct gaa gta caa gca 5631
Ser Ile Leu Lys Ala Glu Leu Glu Arg Lys Met Ala Glu Val Gln Ala
1760 1765 1770

gag ttt cga aga aaa ttt cat gag gta gaa gcc gag cat aac acc aga 5679
Glu Phe Arg Arg Lys Phe His Glu Val Glu Ala Glu His Asn Thr Arg
1775 1780 1785 1790

acg aca aag ata gag aag gat aag aat ctt gtt ata atg aac aaa ctg 5727 
Thr Thr Lys He Glu Lys Asp Lys Asn Leu Val He Met Asn Lys Leu 
1795 1800 1805 

ttg gcg aat gcg ttc ctg tcc aaa tgt act gac aag aag gta tct ccc 5775
Leu Ala Asn Ala Phe Leu Ser Lys Cys Thr Asp Lys Lys Val Ser Pro
1810 1815 1820

tea gga get cca agg ggt aaa att cag cag eta gea cag aga gea gea 5823 
Ser Gly Ala Pro Arg Gly Lys He Gin Gin Leu Ala Gin Arg Ala Ala 
1825 1830 1835 

caa gtg agt gea ctg aga aat tac att get cct cag cag ctt cag gea 5871 
Gin Val Ser Ala Leu Arg Asn Tyr He Ala Pro Gin Gin Leu Gin Ala 
1840 1845 1850 

tct tct ttt cct get cct get ctg gtt teg get cct ctg caa ctt cag 5919 
Ser Ser Phe Pro Ala Pro Ala Leu Val Ser Ala Pro Leu Gin Leu Gin 
1855 1860 1865 1870 

caa tea tea ttt. cct get cct ggt ccg get cct ctg cag cct cag gea 5967 
Gin Ser Ser Phe Pro Ala Pro Gly Pro Ala Pro Leu Gin Pro Gin Ala 
1875 1880 1885 

tct teg ttt cct tct tea gtc tct cgt cca tea gee ctt ctt ctg aat 6015 
Ser Ser Phe Pro Ser Ser Val Ser Arg Pro Ser Ala Leu Leu Leu Asn 
1890 1895 1900 

ttt gcg gtc tgt cca atg cct cag ccc aga cag cct etc ata tec aac 6063 
Phe Ala Val Cys Pro Met Pro Gin Pro Arg Gin Pro Leu He Ser Asn 
1905 1910 1915 

ata get cca act cca tea gtt act cct gea aca aat cca ggt ctg cgt 6111 
He Ala Pro Thr Pro Ser Val Thr Pro Ala Thr Asn Pro Gly Leu Arg 
1920 1925 1930 

tct cct gea cca cac eta aac tea tat aga cca tec tct tea act ccc 6159 
Ser Pro Ala Pro His Leu Asn Ser Tyr Arg Pro Ser Ser Ser Thr Pro 
1935 1940 1945 1950 

gtc gec aca get act cca ace teg tea gtg cct cct caa get ttg aca 6207 
Val Ala Thr Ala Thr Pro Thr Ser Ser Val Pro Pro Gin Ala Leu Thr 
1955 I960 1965 
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W tat tea get gtg tea att cag cag cag caa gaa caa caa ccg caa cag 62 55 
Tyr Ser Ala Val Ser lie Gin Gin Gin Gin Glu Gin Gin Pro Gin Gin 
1970 1975 1980 

age ttg age agt gga ttg cag age aac aat gaa gtg gtt tgt ctt tct 63 03 
Ser Leu Ser Ser Gly Leu Gin Ser Asn Asn Glu Val Val Cys Leu Ser 
1985 1990 1995 

gac gac gag tgacctaaga ggagagatgg ttagggtctt agttattgat 6352 
Asp Asp Glu 
2000 

ttttagagag ttaataatag tatatatata tatgtataag taggttacct aatctctgtc 6412 

gttaatctaa tttagtgagt caggaaccga ctcgttggct aaggtctctc cttttgaaac 6472 

gcaacgttct actttcatgt atataaatac agtctgatca cacaacacaa attgatgatt 6532

gaaaatacta ctgatttaac ttaaaaaaaa aaaaaaaaa 6571 

<210> 3 
<211> 2001 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 3 

Met Lys Lys Asp Glu Lys Ile Gly Leu Thr Gly Arg Thr Ile Tyr Thr
1 5 10 15

Arg Ser Leu Ala Ala Ser Ile Pro Ala Ser Val Glu Gln Glu Thr Pro
20 25 30

Gly Leu Arg Arg Ser Ser Arg Gly Thr Pro Ser Thr Lys Val Ile Thr
35 40 45

Pro Ala Ser Ala Thr Arg Lys Ser Glu Arg Leu Ala Pro Ser Pro Ala
50 55 60

Ser Val Ser Lys Lys Ser Gly Gly Ile Val Lys Asn Ser Thr Pro Ser
65 70 75 80

Ser Leu Arg Arg Ser Asn Arg Gly Lys Thr Glu Val Ser Leu Gln Ser
85 90 95

Ser Lys Gly Ser Asp Asn Ser Ile Arg Lys Gly Asp Thr Ser Pro Asp
100 105 110

Ile Glu Gln Arg Lys Asp Ser Val Glu Glu Ser Thr Asp Lys Ile Lys
115 120 125

Pro Ile Met Ser Ala Arg Ser Tyr Arg Ala Leu Phe Arg Gly Lys Leu
130 135 140

Lys Glu Ser Glu Ala Leu Val Asp Ala Ser Pro Asn Glu Glu Glu Leu
145 150 155 160

Val Val Val Gly Cys Ser Arg Arg Ile Pro Ala Gly Asn Asp Asp Val
165 170 175

Gln Gly Lys Thr Asp Cys Pro Pro Pro Ala Asp Ala Gly Ser Lys Arg
180 185 190

Leu Pro Val Asp Glu Thr Ser Leu Asp Lys Gly Thr Asp Phe Pro Leu 
195 200 205 

Lys Ser Val Thr Glu Thr Glu Lys Ile Val Leu Asp Ala Ser Pro Ile
210 215 220

Val Glu Thr Gly Asp Asp Ser Val Ile Gly Ser Pro Ser Glu Asn Leu
225 230 235 240

Glu Thr Gln Lys Leu Gln Asp Gly Lys Thr Asp Cys Ser Pro Pro Ala
245 250 255

Asn Ala Glu Ser Lys Thr Leu Pro Val Gly Glu Thr Ser Leu Glu Lys
260 265 270

Glu Tyr Pro Gln Lys Phe Gln Asp Asp Asn Thr Asp Cys Leu Pro Pro
275 280 285

Ala Asn Ala Glu Ser Lys Arg Leu Pro Val Gly Glu Thr Ser Leu Glu
290 295 300

Lys Asp Thr Asp Phe Pro Leu Lys Ser Thr Thr Glu Thr Gly Lys Met
305 310 315 320

Val Leu Tyr Ala Ser Pro Ile Val Glu Thr Arg Asp Asp Ser Val Ile
325 330 335

Cys Ser Pro Ser Thr Asn Leu Glu Thr Gln Lys Leu Leu Val Ser Lys
340 345 350

Thr Gly Leu Glu Thr Asp Ile Val Leu Pro Leu Lys Arg Lys Arg Asp
355 360 365

Thr Ala Glu Ile Glu Leu Asp Ala Cys Ala Thr Val Ala Asn Gly Asp
370 375 380

Asp His Val Met Ser Ser Asp Gly Val Ile Pro Ser Pro Ser Gly Cys
385 390 395 400

Lys Asn Asp Asn Arg Pro Glu Met Cys Asn Thr Cys Lys Lys Arg Gln
405 410 415

Lys Val Asn Gly Asp Cys Gln Asn Arg Ser Val Cys Ser Cys Ile Val
420 425 430

Gln Pro Val Glu Glu Ser Asp Asn Val Thr Gln Asp Met Lys Glu Thr
435 440 445
Gly Pro Val Thr Ser Arg Glu Tyr Glu Glu Asn Gly Gln Ile Gln His
450 455 460

Gly Lys Ser Ser Asp Pro Lys Phe Tyr Ser Ser Val Tyr Pro Glu Tyr
465 470 475 480

Trp Val Pro Val Gln Leu Ser Asp Val Gln Leu Glu Gln Tyr Cys Gln
485 490 495

Thr Leu Phe Ser Lys Ser Leu Ser Leu Ser Ser Leu Ser Lys Ile Asp
500 505 510

Leu Gly Ala Leu Glu Glu Thr Leu Asn Ser Val Arg Lys Thr Cys Asp
515 520 525

His Pro Tyr Val Met Asp Ala Ser Leu Lys Gln Leu Leu Thr Lys Asn
530 535 540

Leu Glu Leu His Glu Ile Leu Asp Val Glu Ile Lys Ala Ser Gly Lys
545 550 555 560

Leu His Leu Leu Asp Lys Met Leu Thr His Ile Lys Lys Asn Gly Leu
565 570 575

Lys Ala Val Val Phe Tyr Gln Ala Thr Gln Thr Pro Glu Gly Leu Leu
580 585 590

Leu Gly Asn Ile Leu Glu Asp Phe Val Gly Gln Arg Phe Gly Pro Lys
595 600 605

Ser Tyr Glu His Gly Ile Tyr Ser Ser Lys Lys Asn Ser Ala Ile Asn
610 615 620

Asn Phe Asn Lys Glu Ser Gln Cys Cys Val Leu Leu Leu Glu Thr Arg
625 630 635 640

Ala Cys Ser Gln Thr Ile Lys Leu Leu Arg Ala Asp Ala Phe Ile Leu
645 650 655

Phe Gly Ser Ser Leu Asn Pro Ser His Asp Val Lys His Val Glu Lys
660 665 670

Ile Lys Ile Glu Ser Cys Ser Glu Arg Thr Lys Ile Phe Arg Leu Tyr
675 680 685

Ser Val Cys Thr Val Glu Glu Lys Ala Leu Ile Leu Ala Arg Gln Asn
690 695 700

Met Arg Gln Asn Lys Ala Val Glu Asn Leu Asn Arg Ser Leu Thr His
705 710 715 720

Ala Leu Leu Met Trp Gly Ala Ser Tyr Leu Phe Asp Lys Leu Asp His
725 730 735






Phe His Ser Ser Glu Thr Pro Asp Ser Gly Val Ser Phe Glu Gln Ser
740 745 750

Ile Met Asp Gly Val Ile His Glu Phe Ser Ser Ile Leu Ser Ser Lys
755 760 765

Gly Gly Glu Glu Asn Glu Val Lys Leu Cys Leu Leu Leu Glu Ala Lys
770 775 780

His Ala Gln Gly Thr Tyr Ser Ser Asp Ser Thr Leu Phe Gly Glu Asp
785 790 795 800

His Ile Lys Leu Ser Asp Glu Glu Ser Pro Asn Ile Phe Trp Ser Lys
805 810 815

Leu Leu Gly Gly Lys Asn Pro Met Trp Lys Tyr Pro Ser Asp Thr Pro
820 825 830

Gln Arg Asn Arg Lys Arg Val Gln Tyr Phe Glu Gly Ser Glu Ala Ser
835 840 845

Pro Lys Thr Gly Asp Gly Gly Asn Ala Lys Lys Arg Lys Lys Ala Ser
850 855 860

Asp Asp Val Thr Asp Pro Arg Val Thr Asp Pro Pro Val Asp Asp Asp
865 870 875 880

Glu Arg Lys Ala Ser Gly Lys Asp His Met Gly Ala Leu Glu Ser Pro
885 890 895

Lys Val Ile Thr Leu Gln Ser Ser Cys Lys Ser Ser Gly Thr Asp Gly
900 905 910

Thr Leu Asp Gly Asn Asp Ala Phe Gly Leu Tyr Ser Met Gly Ser His
915 920 925

Ile Ser Gly Ile Pro Glu Asp Met Leu Ala Ser Gln Asp Trp Gly Lys
930 935 940

Ile Pro Asp Glu Ser Gln Arg Arg Leu His Thr Val Leu Lys Pro Lys
945 950 955 960

Met Ala Lys Leu Cys Gln Val Leu His Leu Ser Asp Ala Cys Thr Ser
965 970 975

Met Val Gly Asn Phe Leu Glu Tyr Val Ile Glu Asn His Arg Ile Tyr
980 985 990

Glu Glu Pro Ala Thr Thr Phe Gln Ala Phe Gln Ile Ala Leu Ser Trp
995 1000 1005

Ile Ala Ala Leu Leu Val Lys Gln Ile Leu Ser His Lys Glu Ser Leu
1010 1015 1020

Val Arg Ala Asn Ser Glu Leu Ala Phe Lys Cys Ser Arg Val Glu Val
1025 1030 1035 1040



Asp Tyr Ile Tyr Ser Ile Leu Ser Cys Met Lys Ser Leu Phe Leu Glu
1045 1050 1055

His Thr Gln Gly Leu Gln Phe Asp Cys Phe Gly Thr Asn Ser Lys Gln
1060 1065 1070

Ser Val Val Ser Thr Lys Leu Val Asn Glu Ser Leu Ser Gly Ala Thr
1075 1080 1085

Val Arg Asp Glu Lys Ile Asn Thr Lys Ser Met Arg Asn Ser Ser Glu
1090 1095 1100

Asp Glu Glu Cys Met Thr Glu Lys Arg Cys Ser His Tyr Ser Thr Ala
1105 1110 1115 1120

Thr Arg Asp Ile Glu Lys Thr Ile Ser Gly Ile Lys Lys Lys Tyr Lys
1125 1130 1135

Lys Gln Val Gln Lys Leu Val Gln Glu His Glu Glu Lys Lys Met Glu
1140 1145 1150

Leu Leu Asn Met Tyr Ala Asp Lys Lys Gln Lys Leu Glu Thr Ser Lys
1155 1160 1165

Ser Val Glu Ala Ala Val Ile Arg Ile Thr Cys Ser Arg Thr Ser Thr
1170 1175 1180

Gln Val Gly Asp Leu Lys Leu Leu Asp His Asn Tyr Glu Arg Lys Phe
1185 1190 1195 1200

Asp Glu Ile Lys Ser Glu Lys Asn Glu Cys Leu Lys Ser Leu Glu Gln
1205 1210 1215

Met His Glu Val Ala Lys Lys Lys Leu Ala Glu Asp Glu Ala Cys Trp
1220 1225 1230

Ile Asn Arg Ile Lys Ser Trp Ala Ala Lys Leu Lys Val Cys Val Pro
1235 1240 1245

Ile Gln Ser Gly Asn Asn Lys His Phe Ser Gly Ser Ser Asn Ile Ser
1250 1255 1260

Gln Asn Ala Pro Asp Val Gln Ile Cys Asn Asn Ala Asn Val Glu Ala
1265 1270 1275 1280

Thr Tyr Ala Asp Thr Asn Cys Met Ala Ser Lys Val Asn Gln Val Pro
1285 1290 1295

Glu Ala Glu Asn Thr Leu Gly Thr Met Ser Gly Gly Ser Thr Gln Gln
1300 1305 1310

Val His Glu Met Val Asp Val Arg Asn Asp Glu Thr Met Asp Val Ser
1315 1320 1325
Thr Lys Ser Gln Ser Asn Glu His Ala
1330 1335 1340
Ser Ile Thr Val Pro Glu Ile Leu Ile Pro Ala Asp Cys Gln Glu Glu
1345 1350 1355 1360

Phe Ala Ala Leu Asn Val His Leu Ser Glu Asp Gln Asn Cys Asp Arg
1365 1370 1375

Ile Thr Ser Ala Ala Ser Asp Glu Asp Val Ser Ser Arg Val Pro Glu
1380 1385 1390

Val Ser Gln Ser Leu Glu Asn Leu Ser Ala Ser Pro Glu Phe Ser Leu
1395 1400 1405

Asn Arg Glu Glu Ala Leu Val Thr Thr Glu Asn Arg Arg Thr Ser His
1410 1415 1420

Val Gly Phe Asp Thr Asp Asn Ile Leu Asp Gln Gln Asn Arg Glu Asp
1425 1430 1435 1440

Cys Ser Leu Asp Gln Glu Ile Pro Asp Glu Leu Ala Met Pro Val Gln
1445 1450 1455

His Leu Ala Ser Val Val Glu Thr Arg Gly Ala Ala Glu Ser Asp Gln
1460 1465 1470

Tyr Gly Gln Asp Ile Cys Pro Met Pro Ser Ser Leu Ala Gly Lys Gln
1475 1480 1485

Pro Asp Pro Ala Ala Asn Thr Glu Ser Glu Asn Leu Glu Glu Ala Ile
1490 1495 1500

Glu Pro Gln Ser Ala Gly Ser Glu Thr Val Glu Thr Thr Asp Phe Ala
1505 1510 1515 1520

Ala Ser His Gln Gly Asp Gln Val Thr Cys Pro Leu Leu Ser Ser Pro
1525 1530 1535

Thr Gly Asn Gln Pro Ala Pro Glu Ala Asn Ile Glu Gly Gln Asn Ile
1540 1545 1550

Asn Thr Ser Ala Glu Pro His Val Ala Gly Pro Asp Ala Val Glu Ser
1555 1560 1565

Gly Asp Tyr Ala Val Ile Asp Gln Glu Thr Met Gly Ala Gln Asp Ala
1570 1575 1580

Cys Ser Leu Pro Ser Gly Ser Val Gly Thr Gln Ser Asp Leu Gly Ala
1585 1590 1595 1600

Asn Ile Glu Gly Gln Asn Val Thr Thr Val Ala Gln Leu Pro Thr Asp
1605 1610 1615
Gly Ser Asp Ala Val Val Thr Gly Gly Ser Pro Val Ser Asp Gln Cys
1620 1625 1630

Ala Gln Asp Ala Ser Pro Met Pro Leu Ser Ser Pro Gly Asn His Pro
1635 1640 1645

Asp Thr Ala Val Asn Ile Glu Gly Leu Asp Asn Thr Ser Val Ala Glu
1650 1655 1660

Pro His Ile Ser Gly Ser Asp Ala Cys Glu Met Glu Ile Ser Glu Pro
1665 1670 1675 1680

Gly Pro Gln Val Glu Arg Ser Thr Phe Ala Asn Leu Phe His Glu Gly
1685 1690 1695

Gly Val Glu His Ser Ala Gly Val Thr Ala Leu Val Pro Ser Leu Leu
1700 1705 1710

Asn Asn Gly Thr Glu Gln Ile Ala Val Gln Pro Val Pro Gln Ile Pro
1715 1720 1725

Phe Pro Val Phe Asn Asp Pro Phe Leu His Glu Leu Glu Lys Leu Arg
1730 1735 1740

Arg Glu Ser Glu Asn Ser Lys Lys Thr Phe Glu Glu Lys Lys Ser Ile
1745 1750 1755 1760

Leu Lys Ala Glu Leu Glu Arg Lys Met Ala Glu Val Gln Ala Glu Phe
1765 1770 1775

Arg Arg Lys Phe His Glu Val Glu Ala Glu His Asn Thr Arg Thr Thr
1780 1785 1790

Lys Ile Glu Lys Asp Lys Asn Leu Val Ile Met Asn Lys Leu Leu Ala
1795 1800 1805

Asn Ala Phe Leu Ser Lys Cys Thr Asp Lys Lys Val Ser Pro Ser Gly
1810 1815 1820

Ala Pro Arg Gly Lys Ile Gln Gln Leu Ala Gln Arg Ala Ala Gln Val
1825 1830 1835 1840

Ser Ala Leu Arg Asn Tyr Ile Ala Pro Gln Gln Leu Gln Ala Ser Ser
1845 1850 1855

Phe Pro Ala Pro Ala Leu Val Ser Ala Pro Leu Gln Leu Gln Gln Ser
1860 1865 1870

Ser Phe Pro Ala Pro Gly Pro Ala Pro Leu Gln Pro Gln Ala Ser Ser
1875 1880 1885

Phe Pro Ser Ser Val Ser Arg Pro Ser Ala Leu Leu Leu Asn Phe Ala
1890 1895 1900

Val Cys Pro Met Pro Gln Pro Arg Gln Pro Leu Ile Ser Asn Ile Ala
1905 1910 1915 1920






Pro Thr Pro Ser Val Thr Pro Ala Thr Asn Pro Gly Leu Arg Ser Pro
1925 1930 1935

Ala Pro His Leu Asn Ser Tyr Arg Pro Ser Ser Ser Thr Pro Val Ala
1940 1945 1950

Thr Ala Thr Pro Thr Ser Ser Val Pro Pro Gln Ala Leu Thr Tyr Ser
1955 1960 1965

Ala Val Ser Ile Gln Gln Gln Gln Glu Gln Gln Pro Gln Gln Ser Leu
1970 1975 1980

Ser Ser Gly Leu Gln Ser Asn Asn Glu Val Val Cys Leu Ser Asp Asp
1985 1990 1995 2000

Glu 



<210> 4 
<211> 21 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 4 

catctacggc aatgtaccag c 

<210> 5 
<211> 21 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
Oligonucleotide 

<400> 5 

gatgggaatt ggctgagtgg c 



<210> 6 
<211> 21 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide
<400> 6 

cagttccaaa cgtaaaacgg c 

<210> 7 

<211> 15 

<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 7 

ntcgastwts gwgtt 

<210> 8 
<211> 16 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 8 

ngtcgaswga nawgaa

<210> 9 
<211> 16 
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 9 

wgtgnagwan canaga 

<210> 10 
<211> 16
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 10 






vx: gv;ancv.^g a wang c a 

<210> 11
<211> 16
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 11 

wcgwwgawca ngncga 

<210> 12 
<211> 16
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 12 

wgcnagtnag wanaag 

<210> 13 
<211> 16 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 13 

awgcangncw ganata

<210> 14 
<211> 24 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 14 

ctgtacatac tgagtacaat cgga 
<210> 15
<211> 25
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 15 

gcttcaattc ctgcctcagt tgaac 



<210> 16 
<211> 24 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 16 

ctctacgtgc ttaacatcat gcga 



<210> 17 
<211> 25 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 17 

ccagcttctg ctactagaaa gtcag 



<210> 18 
<211> 25 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 18 

ctggagttgc atgaaatcct ggatg 



<210> 19 
<211> 25 
<212> DNA






<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence : Synthetic
Oligonucleotide

<400> 19 

gctctttgta agctgntcac gagac 

<210> 20 
<211> 24 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 20 

tcgcatgatg ttaagcacgt agag 

<210> 21 
<211> 25 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 21 

gagtactggt ccgtgaacag gtaat 

<210> 22 
<211> 25 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 22 

atgcttgcac aagcatggtc ggaaa 

<210> 23 
<211> 25 
<212> DNA

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence : Synthetic
Oligonucleotide 

<400> 23 

tgcaacatcg tgcatttgct ccaga 

<210> 24 
<211> 25 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 24 

cacaagcatg agtttttcct tccgg 

<210> 25 
<211> 25
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 



<400> 25 

ctgactttct agtagcagaa gctgg 



